org.apache.nutch.scoring.webgraph
Class LoopReader

java.lang.Object
  extended by org.apache.hadoop.conf.Configured
      extended by org.apache.nutch.scoring.webgraph.LoopReader
All Implemented Interfaces:
Configurable

public class LoopReader
extends Configured

The LoopReader tool prints the loopset information for a single url.


Constructor Summary
LoopReader()
           
LoopReader(Configuration conf)
           
 
Method Summary
 void dumpUrl(Path webGraphDb, String url)
          Prints loopset for a single url.
static void main(String[] args)
          Runs the LoopReader tool.
 
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

LoopReader

public LoopReader()

LoopReader

public LoopReader(Configuration conf)

Method Detail

dumpUrl

public void dumpUrl(Path webGraphDb,
                    String url)
             throws IOException
Prints loopset for a single url. The loopset information will show any outlink url that eventually forms a link cycle.

Parameters:
webGraphDb - The WebGraph to check for loops
url - The url to check.
Throws:
IOException - If an error occurs while printing loopset information.
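
To make the idea of a loopset concrete, the following is a minimal, self-contained sketch of the concept only. It is not Nutch's implementation and does not read a WebGraph. The class LoopSetSketch, its methods loopSet and reaches, the in-memory map, and the example URLs are all hypothetical names introduced for illustration. The sketch assumes a loopset can be read as "the outlinks of a url from which that url can be reached again".

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration of the loopset concept; not part of Nutch.
public class LoopSetSketch {

  /** Returns the outlinks of url from which url can be reached again. */
  static Set<String> loopSet(Map<String, List<String>> outlinks, String url) {
    Set<String> result = new TreeSet<>();
    for (String outlink : outlinks.getOrDefault(url, List.of())) {
      if (reaches(outlinks, outlink, url)) {
        result.add(outlink);
      }
    }
    return result;
  }

  /** Iterative depth-first search: is target reachable from start? */
  static boolean reaches(Map<String, List<String>> outlinks, String start, String target) {
    Deque<String> stack = new ArrayDeque<>();
    Set<String> seen = new HashSet<>();
    stack.push(start);
    while (!stack.isEmpty()) {
      String current = stack.pop();
      if (current.equals(target)) {
        return true;
      }
      if (!seen.add(current)) {
        continue; // already expanded this page
      }
      for (String next : outlinks.getOrDefault(current, List.of())) {
        stack.push(next);
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Assumed toy link graph: a -> b -> c -> a forms a cycle, d is a dead end.
    Map<String, List<String>> graph = Map.of(
        "http://a.example/", List.of("http://b.example/", "http://d.example/"),
        "http://b.example/", List.of("http://c.example/"),
        "http://c.example/", List.of("http://a.example/"),
        "http://d.example/", List.of());
    System.out.println(loopSet(graph, "http://a.example/")); // prints [http://b.example/]
  }
}
```

In this toy graph only the outlink to b leads back to a, so it is the only entry reported. The outlink to d never returns and is left out.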

main

public static void main(String[] args)
                 throws Exception
Runs the LoopReader tool. For this tool to work, the loops job must already have been run on the corresponding WebGraph.

Throws:
Exception


Copyright © 2011 The Apache Software Foundation