Class | Description |
---|---|
LinkDatum | A class for holding link information, including the URL, anchor text, a score, the timestamp of the link, and a link type. |
LinkDumper | The LinkDumper tool creates a database of node-to-inlink information that can be read using the nested Reader class. |
LinkDumper.Inverter | Inverts outlinks from the WebGraph to inlinks and attaches node information. |
LinkDumper.LinkNode | Bean class that holds URL-to-node information. |
LinkDumper.LinkNodes | Writable class that holds an array of LinkNode objects. |
LinkDumper.Merger | Merges LinkNode objects into a single array value per URL. |
LinkDumper.Reader | Reader class that prints the URL and all of its inlinks to standard output. |
LinkRank | |
LoopReader | The LoopReader tool prints the loopset information for a single URL. |
Loops | The Loops job identifies link cycles (loops) in the web graph. |
Loops.Finalizer | Finishes the Loops job by aggregating and collecting any found routes. |
Loops.Initializer | Initializes the Loop routes. |
Loops.Looper | Follows a route path looking for the start URL of the route. |
Loops.LoopSet | A set of loops. |
Loops.Route | A link path, or route, followed to identify a link cycle. |
Node | A class that holds the number of inlinks and outlinks for a given URL, along with an inlink score from a link analysis program and any metadata. |
NodeDumper | A tool that dumps the top URLs by number of inlinks, number of outlinks, or score to a text file. |
NodeDumper.Dumper | Outputs the hosts or domains with an associated value. |
NodeDumper.Sorter | Outputs the top URLs sorted in descending order. |
NodeReader | Reads information for a single node from the NodeDb in the WebGraph and prints it to standard output. |
ScoreUpdater | Updates the crawl database with scores from the WebGraph node database. |
WebGraph | Creates three databases: one for inlinks, one for outlinks, and a node database that holds the number of inlinks and outlinks for a URL and the URL's current score. |
WebGraph.OutlinkDb | The OutlinkDb creates a database of all outlinks. |
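
The relationship between the outlink, inlink, and node data described in the table can be illustrated with a small in-memory sketch. It is not the Nutch implementation and uses none of the classes listed above: `WebGraphSketch`, `NodeCounts`, `invert`, and `countNodes` are hypothetical names, and plain Java collections stand in for the on-disk databases. It only shows the idea of inverting outlinks into inlinks and counting both per URL.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory sketch; not part of the Nutch API. */
final class WebGraphSketch {

    /** Per-URL link counts, loosely mirroring the counts the table attributes to Node. */
    static final class NodeCounts {
        int numInlinks;
        int numOutlinks;
    }

    /** Inverts a (source URL -> outlink targets) map into (target URL -> inlink sources). */
    static Map<String, List<String>> invert(Map<String, List<String>> outlinks) {
        Map<String, List<String>> inlinks = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : outlinks.entrySet()) {
            for (String target : entry.getValue()) {
                inlinks.computeIfAbsent(target, k -> new ArrayList<>()).add(entry.getKey());
            }
        }
        return inlinks;
    }

    /** Counts inlinks and outlinks for every URL that appears as a source or a target. */
    static Map<String, NodeCounts> countNodes(Map<String, List<String>> outlinks) {
        Map<String, NodeCounts> nodes = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : outlinks.entrySet()) {
            nodes.computeIfAbsent(entry.getKey(), k -> new NodeCounts()).numOutlinks
                    += entry.getValue().size();
            for (String target : entry.getValue()) {
                nodes.computeIfAbsent(target, k -> new NodeCounts()).numInlinks++;
            }
        }
        return nodes;
    }

    public static void main(String[] args) {
        // Example URLs are placeholders chosen for illustration.
        Map<String, List<String>> outlinks = new TreeMap<>();
        outlinks.put("http://a.example/", List.of("http://b.example/", "http://c.example/"));
        outlinks.put("http://b.example/", List.of("http://c.example/"));

        // Print each URL with its inlinks, similar in spirit to a link dump.
        invert(outlinks).forEach((url, sources) -> System.out.println(url + " <- " + sources));

        // Print the per-URL counts.
        countNodes(outlinks).forEach((url, n) ->
                System.out.println(url + " in=" + n.numInlinks + " out=" + n.numOutlinks));
    }
}
```

In this sketch the inversion and the counting are single passes over the outlink map. The real tools operate on databases rather than in-memory maps, so the sketch should be read as a description of the data relationships only.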
Copyright © 2014 The Apache Software Foundation