Class Summary |
LinkDatum |
A class for holding link information including the url, anchor text, a score,
the timestamp of the link and a link type. |
LinkDumper |
The LinkDumper tool creates a database of node to inlink information that can
be read using the nested Reader class. |
LinkDumper.Inverter |
Inverts outlinks from the WebGraph to inlinks and attaches node
information. |
LinkDumper.LinkNode |
Bean class which holds url to node information. |
LinkDumper.LinkNodes |
Writable class which holds an array of LinkNode objects. |
LinkDumper.Merger |
Merges LinkNode objects into a single array value per url. |
LinkDumper.Reader |
Reader class which prints the url and all of its inlinks to standard
output. |
LinkRank |
|
LoopReader |
The LoopReader tool prints the loopset information for a single url. |
Loops |
The Loops job identifies link cycles inside the web graph. |
Loops.Finalizer |
Finishes the Loops job by aggregating and collecting any found routes. |
Loops.Initializer |
Initializes the Loop routes. |
Loops.Looper |
Follows a route path looking for the start url of the route. |
Loops.LoopSet |
A set of loops. |
Loops.Route |
A link path or route looking to identify a link cycle. |
Node |
A class which holds the number of inlinks and outlinks for a given url along
with an inlink score from a link analysis program and any metadata. |
NodeDumper |
A tool that dumps out the top urls by number of inlinks, number of outlinks,
or by score, to a text file. |
NodeDumper.Sorter |
Outputs the top urls sorted in descending order. |
NodeReader |
Reads information for a single node from the NodeDb in the WebGraph and
prints it to standard output. |
ScoreUpdater |
Updates the score from the WebGraph node database into the crawl database. |
WebGraph |
Creates three databases, one for inlinks, one for outlinks, and a node
database that holds the number of inlinks and outlinks for a url and the
current score for the url. |
WebGraph.OutlinkDb |
The OutlinkDb creates a database of all outlinks. |
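
Several of the classes above (WebGraph, LinkDumper.Inverter) are described as turning outlinks into inlinks. The sketch below illustrates that inversion on plain in-memory maps. It is a simplified illustration only: the class name `InlinkSketch` and the method `invert` are hypothetical, it does not use any Nutch or Hadoop types, and it says nothing about how the actual jobs are implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; not part of the Nutch API.
class InlinkSketch {

    // Turns a map of (source url -> outlink targets) into a map of
    // (target url -> source urls that link to it).
    static Map<String, List<String>> invert(Map<String, List<String>> outlinks) {
        Map<String, List<String>> inlinks = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : outlinks.entrySet()) {
            String source = entry.getKey();
            for (String target : entry.getValue()) {
                // Create the inlink list for this target on first use.
                inlinks.computeIfAbsent(target, k -> new ArrayList<>()).add(source);
            }
        }
        return inlinks;
    }

    public static void main(String[] args) {
        // Made-up example urls, used only for demonstration.
        Map<String, List<String>> outlinks = new TreeMap<>();
        outlinks.put("http://a.example/", Arrays.asList("http://b.example/", "http://c.example/"));
        outlinks.put("http://b.example/", Arrays.asList("http://c.example/"));

        // Print each url followed by the urls that link to it.
        for (Map.Entry<String, List<String>> e : invert(outlinks).entrySet()) {
            System.out.println(e.getKey() + " <- " + e.getValue());
        }
    }
}
```

The size of each inlink list in the result corresponds to the kind of per-url inlink count that the Node class is described as holding; urls with no inlinks simply do not appear in the inverted map.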