org.apache.nutch.scoring.webgraph
Class NodeDumper
java.lang.Object
org.apache.hadoop.conf.Configured
org.apache.nutch.scoring.webgraph.NodeDumper
- All Implemented Interfaces:
- Configurable, Tool
public class NodeDumper
- extends Configured
- implements Tool
A tool that dumps out the top urls by number of inlinks, number of outlinks,
or by score, to a text file. One of the major uses of this tool is to check
the top scoring urls of a link analysis program such as LinkRank.
Dumping by number of inlinks or number of outlinks requires that the WebGraph
program has already been run. Dumping by link analysis score requires that a
program such as LinkRank has been run to update the NodeDb of the WebGraph.
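The ranking described above can be sketched in plain Java. This is a conceptual illustration only, not the Nutch implementation (which runs as a Hadoop job over the NodeDb). The names TopUrlsSketch, Node, and topN are hypothetical stand-ins, and the sample urls and values are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Conceptual sketch only: all names here are hypothetical, not Nutch classes.
final class TopUrlsSketch {

    // Hypothetical stand-in for one NodeDb entry.
    static final class Node {
        final String url;
        final int inlinks;
        final int outlinks;
        final double score;

        Node(String url, int inlinks, int outlinks, double score) {
            this.url = url;
            this.inlinks = inlinks;
            this.outlinks = outlinks;
            this.score = score;
        }
    }

    // Returns at most topN nodes, sorted in descending order of the chosen metric.
    static List<Node> topN(List<Node> nodes, ToDoubleFunction<Node> metric, long topN) {
        List<Node> sorted = new ArrayList<>(nodes);
        // Compare b against a so that larger metric values come first.
        sorted.sort((a, b) -> Double.compare(metric.applyAsDouble(b), metric.applyAsDouble(a)));
        int limit = (int) Math.max(0, Math.min(topN, sorted.size()));
        return new ArrayList<>(sorted.subList(0, limit));
    }

    public static void main(String[] args) {
        List<Node> nodes = Arrays.asList(
            new Node("http://a.example/", 5, 1, 0.2),
            new Node("http://b.example/", 12, 3, 0.9),
            new Node("http://c.example/", 7, 9, 0.5));

        // Rank by inlinks; pass n -> n.outlinks or n -> n.score for the other modes.
        for (Node n : topN(nodes, n -> n.inlinks, 2)) {
            System.out.println(n.url + "\t" + n.inlinks);
        }
    }
}
```

Passing the metric as a function keeps a single sort routine for all three dump modes; the real tool selects the mode through its DumpType argument instead.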
Nested Class Summary |
static class |
NodeDumper.Sorter
Outputs the top urls sorted in descending order. |
Field Summary |
static org.apache.commons.logging.Log |
LOG
|
Method Summary |
void |
dumpNodes(Path webGraphDb,
org.apache.nutch.scoring.webgraph.NodeDumper.DumpType type,
long topN,
Path output)
Runs the process to dump the top urls out to a text file. |
static void |
main(String[] args)
|
int |
run(String[] args)
Runs the node dumper tool. |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
LOG
public static final org.apache.commons.logging.Log LOG
NodeDumper
public NodeDumper()
dumpNodes
public void dumpNodes(Path webGraphDb,
org.apache.nutch.scoring.webgraph.NodeDumper.DumpType type,
long topN,
Path output)
throws IOException
- Runs the process to dump the top urls out to a text file.
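- Parameters:
webGraphDb
- The WebGraph from which to pull values.
type
- Which value to rank by: inlinks, outlinks, or scores.
topN
- The number of top urls to dump.
output
- The path to which the text output is written.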
- Throws:
IOException
- If an error occurs while dumping the top values.
main
public static void main(String[] args)
throws Exception
- Throws:
Exception
run
public int run(String[] args)
throws Exception
- Runs the node dumper tool.
- Specified by:
run
in interface Tool
- Throws:
Exception
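The run/main pairing follows the usual convention for Hadoop's Tool interface: run returns an integer status and main passes it on as the process exit code (in Hadoop this is normally routed through ToolRunner.run). The sketch below mimics that shape without any Hadoop dependency. MiniTool and ExitCodeSketch are hypothetical stand-ins, and the specific return values (0 for success, -1 for bad arguments) are assumptions for illustration, not documented behavior of NodeDumper.

```java
// Hypothetical sketch of the Tool-style run/main convention; not Nutch or Hadoop code.
final class ExitCodeSketch implements MiniTool {

    // Returns a status code instead of exiting, so callers and tests can inspect it.
    @Override
    public int run(String[] args) {
        if (args == null || args.length == 0) {
            System.err.println("usage: ExitCodeSketch <args...>");
            return -1; // assumed failure code for this sketch
        }
        // A real tool would parse its options and launch its job here.
        return 0; // assumed success code for this sketch
    }

    public static void main(String[] args) throws Exception {
        // main only translates the status returned by run into the process exit code.
        int res = new ExitCodeSketch().run(args);
        System.exit(res);
    }
}

// Stand-in with the same method shape as Hadoop's Tool.run(String[]).
interface MiniTool {
    int run(String[] args) throws Exception;
}
```

Keeping the work in run and the exit call in main means the tool can be driven from other Java code without terminating the JVM.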
Copyright © 2006 The Apache Software Foundation