org.apache.nutch.scoring.webgraph
Class NodeDumper

java.lang.Object
  extended by org.apache.hadoop.conf.Configured
      extended by org.apache.nutch.scoring.webgraph.NodeDumper
All Implemented Interfaces:
Configurable, Tool

public class NodeDumper
extends Configured
implements Tool

A tool that dumps the top urls, ranked by number of inlinks, number of outlinks, or score, to a text file. One major use of this tool is to check the top-scoring urls produced by a link analysis program such as LinkRank. Ranking by number of inlinks or outlinks requires that the WebGraph program has already been run. Ranking by link analysis score requires that a program such as LinkRank, which updates the NodeDb of the WebGraph, has already been run.
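
The ranking described above can be illustrated with a small, self-contained sketch. It is not the NodeDumper implementation and does not use Nutch or Hadoop classes. The names TopNodesSketch, Metric, NodeRecord, and selectTop are hypothetical and exist only for this example, and the in-memory list stands in for the NodeDb. The sketch assumes that "top urls" means the topN records with the largest value of the chosen metric, listed in descending order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class TopNodesSketch {

    /** Hypothetical stand-in for the value a dump is ranked by. */
    public enum Metric { INLINKS, OUTLINKS, SCORE }

    /** Hypothetical in-memory node record; not a Nutch class. */
    public static final class NodeRecord {
        public final String url;
        public final int inlinks;
        public final int outlinks;
        public final float score;

        public NodeRecord(String url, int inlinks, int outlinks, float score) {
            this.url = url;
            this.inlinks = inlinks;
            this.outlinks = outlinks;
            this.score = score;
        }

        /** Returns the value of the requested metric for this record. */
        public double value(Metric metric) {
            switch (metric) {
                case INLINKS:  return inlinks;
                case OUTLINKS: return outlinks;
                default:       return score;
            }
        }
    }

    /** Returns at most topN records with the largest metric value, in descending order. */
    public static List<NodeRecord> selectTop(List<NodeRecord> nodes, Metric metric, int topN) {
        if (topN <= 0) {
            return new ArrayList<>();
        }
        Comparator<NodeRecord> byValue = Comparator.comparingDouble(n -> n.value(metric));
        // Min-heap: the smallest retained value sits at the head and is evicted first.
        PriorityQueue<NodeRecord> heap = new PriorityQueue<>(byValue);
        for (NodeRecord node : nodes) {
            heap.offer(node);
            if (heap.size() > topN) {
                heap.poll();
            }
        }
        List<NodeRecord> result = new ArrayList<>(heap);
        result.sort(byValue.reversed());
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data for illustration only.
        List<NodeRecord> nodes = Arrays.asList(
            new NodeRecord("http://a.example/", 5, 1, 0.20f),
            new NodeRecord("http://b.example/", 9, 3, 0.10f),
            new NodeRecord("http://c.example/", 2, 7, 0.70f));
        // One line per url: the url, a tab, then the metric value.
        for (NodeRecord node : selectTop(nodes, Metric.INLINKS, 2)) {
            System.out.println(node.url + "\t" + node.value(Metric.INLINKS));
        }
    }
}
```

Keeping only topN records in a bounded heap avoids sorting the full input, which matters when the node set is large. The order of records with equal metric values is left unspecified in this sketch.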


Nested Class Summary
static class NodeDumper.Sorter
          Outputs the top urls sorted in descending order.
 
Field Summary
static org.apache.commons.logging.Log LOG
           
 
Constructor Summary
NodeDumper()
           
 
Method Summary
 void dumpNodes(Path webGraphDb, org.apache.nutch.scoring.webgraph.NodeDumper.DumpType type, long topN, Path output)
          Runs the process to dump the top urls out to a text file.
static void main(String[] args)
           
 int run(String[] args)
          Runs the node dumper tool.
 
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf
 

Field Detail

LOG

public static final org.apache.commons.logging.Log LOG
Constructor Detail

NodeDumper

public NodeDumper()
Method Detail

dumpNodes

public void dumpNodes(Path webGraphDb,
                      org.apache.nutch.scoring.webgraph.NodeDumper.DumpType type,
                      long topN,
                      Path output)
               throws IOException
Runs the process to dump the top urls out to a text file.

Parameters:
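webGraphDb - The WebGraph from which to pull values.
type - The value to rank by: number of inlinks, number of outlinks, or score.
topN - The number of top urls to dump.
output - The path to which the text output is written.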
Throws:
IOException - If an error occurs while dumping the top values.

main

public static void main(String[] args)
                 throws Exception
Throws:
Exception

run

public int run(String[] args)
        throws Exception
Runs the node dumper tool.

Specified by:
run in interface Tool
Throws:
Exception


Copyright © 2006 The Apache Software Foundation