org.apache.nutch.scoring.webgraph
Class LinkDumper

java.lang.Object
  extended by org.apache.hadoop.conf.Configured
      extended by org.apache.nutch.scoring.webgraph.LinkDumper
All Implemented Interfaces:
Configurable, Tool

public class LinkDumper
extends Configured
implements Tool

The LinkDumper tool creates a database of node-to-inlink information that can be read with the nested Reader class. This makes it possible to quickly review the inlinks and scoring state of a single URL and determine why that URL ranks the way it does. The tool is intended for use with the LinkRank analysis.


Nested Class Summary
static class LinkDumper.Inverter
          Inverts outlinks from the WebGraph to inlinks and attaches node information.
static class LinkDumper.LinkNode
          Bean class that holds URL-to-node information.
static class LinkDumper.LinkNodes
          Writable class that holds an array of LinkNode objects.
static class LinkDumper.Merger
          Merges LinkNode objects into a single array value per URL.
static class LinkDumper.Reader
          Reader class that prints the URL and all of its inlinks to standard output (System.out).
 
Field Summary
static String DUMP_DIR
           
static org.apache.commons.logging.Log LOG
           
 
Constructor Summary
LinkDumper()
           
 
Method Summary
 void dumpLinks(Path webGraphDb)
          Runs the inverter and merger jobs of the LinkDumper tool to create the URL-to-inlink node database.
static void main(String[] args)
           
 int run(String[] args)
          Runs the LinkDumper tool.
 
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf
 

Field Detail

LOG

public static final org.apache.commons.logging.Log LOG

DUMP_DIR

public static final String DUMP_DIR
See Also:
Constant Field Values
Constructor Detail

LinkDumper

public LinkDumper()
Method Detail

dumpLinks

public void dumpLinks(Path webGraphDb)
               throws IOException
Runs the inverter and merger jobs of the LinkDumper tool to create the URL-to-inlink node database.

Throws:
IOException
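
The invert-then-merge idea behind dumpLinks can be illustrated with a small in-memory sketch. This is not the LinkDumper implementation, which runs Hadoop jobs over the WebGraph. The class InlinkSketch, the Inlink type, and the method invertAndMerge are hypothetical names introduced only for illustration, and attaching the source URL's score to each inverted link is an assumption about what "node information" carries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory illustration; not part of Nutch.
class InlinkSketch {

    /** Hypothetical stand-in for a link node: the linking URL plus its score. */
    static final class Inlink {
        final String fromUrl;
        final float score;

        Inlink(String fromUrl, float score) {
            this.fromUrl = fromUrl;
            this.score = score;
        }
    }

    /**
     * Inverts outlinks (source -> targets) into inlinks (target -> sources),
     * attaches the source's score to each inverted link (an assumption),
     * and merges all inlinks of a URL into a single list.
     */
    static Map<String, List<Inlink>> invertAndMerge(
            Map<String, List<String>> outlinks, Map<String, Float> scores) {
        Map<String, List<Inlink>> inlinks = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : outlinks.entrySet()) {
            String from = entry.getKey();
            // URLs without a known score default to 0.
            float score = scores.getOrDefault(from, 0.0f);
            for (String to : entry.getValue()) {
                // Invert the edge and group it under the target URL.
                inlinks.computeIfAbsent(to, k -> new ArrayList<>())
                       .add(new Inlink(from, score));
            }
        }
        return inlinks;
    }

    public static void main(String[] args) {
        // TreeMap keeps the source URLs in a deterministic order.
        Map<String, List<String>> outlinks = new TreeMap<>();
        outlinks.put("http://a.example/", List.of("http://c.example/"));
        outlinks.put("http://b.example/",
                List.of("http://c.example/", "http://a.example/"));
        Map<String, Float> scores =
                Map.of("http://a.example/", 0.5f, "http://b.example/", 0.25f);

        Map<String, List<Inlink>> db = invertAndMerge(outlinks, scores);

        // Look up a single URL and list its inlinks.
        String url = "http://c.example/";
        System.out.println(url);
        for (Inlink in : db.getOrDefault(url, List.of())) {
            System.out.println("  inlink " + in.fromUrl + " score=" + in.score);
        }
    }
}
```

Looking up one URL in the resulting map mirrors what the nested Reader class is described as doing: printing a URL together with all of its inlinks.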

main

public static void main(String[] args)
                 throws Exception
Throws:
Exception

run

public int run(String[] args)
        throws Exception
Runs the LinkDumper tool. This only creates the database; to read the values, use the nested Reader tool.

Specified by:
run in interface Tool
Throws:
Exception


Copyright © 2006 The Apache Software Foundation