org.apache.nutch.scoring.webgraph
Class LinkDumper
java.lang.Object
org.apache.hadoop.conf.Configured
org.apache.nutch.scoring.webgraph.LinkDumper
- All Implemented Interfaces:
- Configurable, Tool
public class LinkDumper
- extends Configured
- implements Tool
The LinkDumper tool creates a database of node-to-inlink information that can
be read using the nested Reader class. This allows the inlink and scoring
state of a single url to be reviewed quickly to determine why that url ranks
the way it does. This tool is intended for use with the LinkRank analysis.
Nested Class Summary
static class LinkDumper.Inverter
- Inverts outlinks from the WebGraph to inlinks and attaches node information.
static class LinkDumper.LinkNode
- Bean class which holds url to node information.
static class LinkDumper.LinkNodes
- Writable class which holds an array of LinkNode objects.
static class LinkDumper.Merger
- Merges LinkNode objects into a single array value per url.
static class LinkDumper.Reader
- Reader class which will print out the url and all of its inlinks to system out.
Method Summary
void dumpLinks(Path webGraphDb)
- Runs the inverter and merger jobs of the LinkDumper tool to create the url to inlink node database.
static void main(String[] args)
int run(String[] args)
- Runs the LinkDumper tool.
Methods inherited from class java.lang.Object
- clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
LOG
public static final org.apache.commons.logging.Log LOG
DUMP_DIR
public static final String DUMP_DIR
LinkDumper
public LinkDumper()
dumpLinks
public void dumpLinks(Path webGraphDb)
throws IOException
- Runs the inverter and merger jobs of the LinkDumper tool to create the
url to inlink node database.
- Throws:
IOException
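The two phases named above (invert, then merge) can be illustrated with a minimal in-memory sketch. This is not the Nutch implementation: the class InlinkInversionSketch and its methods are hypothetical, it uses plain Java collections instead of Hadoop jobs, and it omits the node (score) information that the real Inverter attaches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of Nutch or Hadoop.
class InlinkInversionSketch {

    // Phase 1 (invert): turn each "source -> outlinks" entry into
    // (target, source) pairs, one pair per link.
    static List<String[]> invert(Map<String, List<String>> outlinks) {
        List<String[]> pairs = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : outlinks.entrySet()) {
            for (String target : entry.getValue()) {
                pairs.add(new String[] {target, entry.getKey()});
            }
        }
        return pairs;
    }

    // Phase 2 (merge): collect every source that points at the same
    // target url into a single list value for that url.
    static Map<String, List<String>> merge(List<String[]> pairs) {
        Map<String, List<String>> inlinks = new TreeMap<>();
        for (String[] pair : pairs) {
            inlinks.computeIfAbsent(pair[0], k -> new ArrayList<>()).add(pair[1]);
        }
        return inlinks;
    }

    public static void main(String[] args) {
        // Assumed sample graph: a links to b and c, b links to c.
        Map<String, List<String>> outlinks = new TreeMap<>();
        outlinks.put("http://a.example/",
                Arrays.asList("http://b.example/", "http://c.example/"));
        outlinks.put("http://b.example/",
                Arrays.asList("http://c.example/"));

        Map<String, List<String>> inlinks = merge(invert(outlinks));
        for (Map.Entry<String, List<String>> e : inlinks.entrySet()) {
            System.out.println(e.getKey() + " <- " + e.getValue());
        }
    }
}
```

Looking up a single url in the merged map corresponds roughly to what the nested Reader is described as doing: printing a url together with all of its inlinks.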
main
public static void main(String[] args)
throws Exception
- Throws:
Exception
run
public int run(String[] args)
throws Exception
- Runs the LinkDumper tool. This only creates the database; to read the
values, use the nested Reader tool.
- Specified by:
run
in interface Tool
- Throws:
Exception
Copyright © 2006 The Apache Software Foundation