Package org.apache.nutch.scoring.webgraph

Class Summary
LinkDatum A class for holding link information including the url, anchor text, a score, the timestamp of the link and a link type.
LinkDumper The LinkDumper tool creates a database of node to inlink information that can be read using the nested Reader class.
LinkDumper.Inverter Inverts outlinks from the WebGraph to inlinks and attaches node information.
LinkDumper.LinkNode Bean class which holds url to node information.
LinkDumper.LinkNodes Writable class which holds an array of LinkNode objects.
LinkDumper.Merger Merges LinkNode objects into a single array value per url.
LinkDumper.Reader Reader class which prints the url and all of its inlinks to System.out.
LinkRank  
LoopReader The LoopReader tool prints the loopset information for a single url.
Loops The Loops job identifies link cycles (loops) inside the web graph.
Loops.Finalizer Finishes the Loops job by aggregating and collecting any found routes.
Loops.Initializer Initializes the Loop routes.
Loops.Looper Follows a route path looking for the start url of the route.
Loops.LoopSet A set of loops.
Loops.Route A link path or route looking to identify a link cycle.
Node A class which holds the number of inlinks and outlinks for a given url along with an inlink score from a link analysis program and any metadata.
NodeDumper A tool that dumps out the top urls by number of inlinks, number of outlinks, or by score, to a text file.
NodeDumper.Dumper Outputs the hosts or domains with an associated value.
NodeDumper.Sorter Outputs the top urls sorted in descending order.
NodeReader Reads information for a single node from the NodeDb in the WebGraph and prints it to System.out.
ScoreUpdater Updates the score from the WebGraph node database into the crawl database.
WebGraph Creates three databases: one for inlinks, one for outlinks, and a node database that holds the number of inlinks and outlinks for a url and the current score for the url.
WebGraph.OutlinkDb The OutlinkDb creates a database of all outlinks.
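
The summaries above describe inverting outlinks into inlinks and keeping per-url inlink and outlink counts. The following is a minimal in-memory sketch of that idea only. It does not use the Nutch or Hadoop APIs, and the names WebGraphSketch, Counts, invert and countNodes are hypothetical, chosen for illustration; the real classes run as MapReduce jobs over on-disk databases.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WebGraphSketch {

    /** Hypothetical stand-in for a node record: link counts only, no score or metadata. */
    public static final class Counts {
        public int inlinks;
        public int outlinks;
    }

    /** Inverts a "url -> outlinks" map into a "url -> inlinks" map. */
    public static Map<String, List<String>> invert(Map<String, List<String>> outlinks) {
        Map<String, List<String>> inlinks = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : outlinks.entrySet()) {
            String source = entry.getKey();
            for (String target : entry.getValue()) {
                // Each outlink source -> target becomes an inlink recorded under target.
                inlinks.computeIfAbsent(target, k -> new ArrayList<>()).add(source);
            }
        }
        return inlinks;
    }

    /** Counts inlinks and outlinks for every url that appears as a source or a target. */
    public static Map<String, Counts> countNodes(Map<String, List<String>> outlinks) {
        Map<String, Counts> nodes = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : outlinks.entrySet()) {
            Counts source = nodes.computeIfAbsent(entry.getKey(), k -> new Counts());
            source.outlinks += entry.getValue().size();
            for (String target : entry.getValue()) {
                nodes.computeIfAbsent(target, k -> new Counts()).inlinks++;
            }
        }
        return nodes;
    }

    public static void main(String[] args) {
        // Example urls are made up for the demonstration.
        Map<String, List<String>> outlinks = new LinkedHashMap<>();
        outlinks.put("http://a.example/", Arrays.asList("http://b.example/", "http://c.example/"));
        outlinks.put("http://b.example/", Arrays.asList("http://c.example/"));

        for (Map.Entry<String, List<String>> e : invert(outlinks).entrySet()) {
            System.out.println(e.getKey() + " <- " + e.getValue());
        }
        for (Map.Entry<String, Counts> e : countNodes(outlinks).entrySet()) {
            System.out.println(e.getKey() + " in=" + e.getValue().inlinks
                    + " out=" + e.getValue().outlinks);
        }
    }
}
```

Sorted maps are used here only so the printed order is stable; the sketch makes no claim about how the actual databases are ordered or stored.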

Copyright © 2012 The Apache Software Foundation