Packages that use ParseData | |
---|---|
org.apache.nutch.crawl | Crawl control code. |
org.apache.nutch.parse | |
org.apache.nutch.scoring | |
org.apache.nutch.scoring.link | |
org.apache.nutch.scoring.opic | |
org.apache.nutch.scoring.tld | Top Level Domain Scoring plugin. |
org.apache.nutch.scoring.urlmeta | URL Meta Tag Scoring Plugin |
org.apache.nutch.segment |
Uses of ParseData in org.apache.nutch.crawl |
---|
Methods in org.apache.nutch.crawl with parameters of type ParseData | |
---|---|
void |
LinkDb.map(Text key,
ParseData parseData,
OutputCollector<Text,Inlinks> output,
Reporter reporter)
|
Uses of ParseData in org.apache.nutch.parse |
---|
Methods in org.apache.nutch.parse that return ParseData | |
---|---|
ParseData |
Parse.getData()
Other data extracted from the page. |
ParseData |
ParseImpl.getData()
|
static ParseData |
ParseData.read(DataInput in)
|
Methods in org.apache.nutch.parse with parameters of type ParseData | |
---|---|
void |
ParseResult.put(String key,
ParseText text,
ParseData data)
Store a result of parsing. |
void |
ParseResult.put(Text key,
ParseText text,
ParseData data)
Store a result of parsing. |
Constructors in org.apache.nutch.parse with parameters of type ParseData | |
---|---|
ParseImpl(ParseText text,
ParseData data)
|
|
ParseImpl(ParseText text,
ParseData data,
boolean isCanonical)
|
|
ParseImpl(String text,
ParseData data)
|
Uses of ParseData in org.apache.nutch.scoring |
---|
Methods in org.apache.nutch.scoring with parameters of type ParseData | |
---|---|
CrawlDatum |
ScoringFilter.distributeScoreToOutlinks(Text fromUrl,
ParseData parseData,
Collection<Map.Entry<Text,CrawlDatum>> targets,
CrawlDatum adjust,
int allCount)
Distribute score value from the current page to all its outlinked pages. |
CrawlDatum |
ScoringFilters.distributeScoreToOutlinks(Text fromUrl,
ParseData parseData,
Collection<Map.Entry<Text,CrawlDatum>> targets,
CrawlDatum adjust,
int allCount)
|
Uses of ParseData in org.apache.nutch.scoring.link |
---|
Methods in org.apache.nutch.scoring.link with parameters of type ParseData | |
---|---|
CrawlDatum |
LinkAnalysisScoringFilter.distributeScoreToOutlinks(Text fromUrl,
ParseData parseData,
Collection<Map.Entry<Text,CrawlDatum>> targets,
CrawlDatum adjust,
int allCount)
|
Uses of ParseData in org.apache.nutch.scoring.opic |
---|
Methods in org.apache.nutch.scoring.opic with parameters of type ParseData | |
---|---|
CrawlDatum |
OPICScoringFilter.distributeScoreToOutlinks(Text fromUrl,
ParseData parseData,
Collection<Map.Entry<Text,CrawlDatum>> targets,
CrawlDatum adjust,
int allCount)
Get a float value from Fetcher.SCORE_KEY, divide it by the number of outlinks and apply. |
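
The OPIC summary above describes a simple piece of arithmetic: one page score is split across that page's outlinks. The following is a minimal, Nutch-free sketch of that division only. The class `OpicShareSketch` and its methods are hypothetical, and reading "apply" as assigning the share to each outlink is an assumption; the real filter works on `CrawlDatum` objects and may behave differently.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; it is not part of Nutch.
class OpicShareSketch {

    /** Splits a page score evenly across its outlinks; yields 0 when there are none. */
    static float sharePerOutlink(float pageScore, int outlinkCount) {
        if (outlinkCount <= 0) {
            return 0.0f; // nothing to distribute to, and avoids division by zero
        }
        return pageScore / outlinkCount;
    }

    /**
     * Assumption: "apply" means giving every outlink the same share.
     * Returns a map from outlink URL to the share assigned to it.
     */
    static Map<String, Float> distribute(float pageScore, List<String> outlinks) {
        float share = sharePerOutlink(pageScore, outlinks.size());
        Map<String, Float> assigned = new LinkedHashMap<>();
        for (String url : outlinks) {
            assigned.put(url, share);
        }
        return assigned;
    }
}
```

Calling `OpicShareSketch.distribute(1.0f, outlinks)` with four outlinks would assign each of them a quarter of the page score.
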
Uses of ParseData in org.apache.nutch.scoring.tld |
---|
Methods in org.apache.nutch.scoring.tld with parameters of type ParseData | |
---|---|
CrawlDatum |
TLDScoringFilter.distributeScoreToOutlink(Text fromUrl,
Text toUrl,
ParseData parseData,
CrawlDatum target,
CrawlDatum adjust,
int allCount,
int validCount)
|
CrawlDatum |
TLDScoringFilter.distributeScoreToOutlinks(Text fromUrl,
ParseData parseData,
Collection<Map.Entry<Text,CrawlDatum>> targets,
CrawlDatum adjust,
int allCount)
|
Uses of ParseData in org.apache.nutch.scoring.urlmeta |
---|
Methods in org.apache.nutch.scoring.urlmeta with parameters of type ParseData | |
---|---|
CrawlDatum |
URLMetaScoringFilter.distributeScoreToOutlinks(Text fromUrl,
ParseData parseData,
Collection<Map.Entry<Text,CrawlDatum>> targets,
CrawlDatum adjust,
int allCount)
Takes the metatags listed in your "urlmeta.tags" property and looks for them inside the parseData object. |
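
The lookup described for the URL meta filter can be pictured as matching configured tag names against a page's metadata. The sketch below uses plain Java maps instead of Nutch types. `UrlMetaLookupSketch` and `findConfiguredTags` are hypothetical names, and treating the "urlmeta.tags" value as a comma-separated list is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration class; it is not part of Nutch.
class UrlMetaLookupSketch {

    /**
     * Returns the configured tags that are present in the page metadata.
     * Assumption: tagsProperty is a comma-separated list of tag names.
     */
    static Map<String, String> findConfiguredTags(String tagsProperty,
                                                  Map<String, String> parseMeta) {
        Map<String, String> found = new LinkedHashMap<>();
        if (tagsProperty == null) {
            return found; // property not set: nothing to look for
        }
        for (String tag : tagsProperty.split(",")) {
            String name = tag.trim();
            if (name.isEmpty()) {
                continue; // skip empty entries such as "a,,b"
            }
            String value = parseMeta.get(name);
            if (value != null) {
                found.put(name, value);
            }
        }
        return found;
    }
}
```

Tags that are configured but absent from the metadata are simply left out of the result, so callers only see values that were actually found on the page.
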
Uses of ParseData in org.apache.nutch.segment |
---|
Methods in org.apache.nutch.segment with parameters of type ParseData | |
---|---|
boolean |
SegmentMergeFilters.filter(WritableComparable key,
CrawlDatum generateData,
CrawlDatum fetchData,
CrawlDatum sigData,
Content content,
ParseData parseData,
ParseText parseText,
Collection<CrawlDatum> linked)
Iterates over all SegmentMergeFilter extensions; if any of them returns false, this method returns false as well. |
boolean |
SegmentMergeFilter.filter(WritableComparable key,
CrawlDatum generateData,
CrawlDatum fetchData,
CrawlDatum sigData,
Content content,
ParseData parseData,
ParseText parseText,
Collection<CrawlDatum> linked)
The filtering method which gets all information being merged for a given key (URL). |
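
The relationship between `SegmentMergeFilters.filter` and the individual `SegmentMergeFilter` extensions is a reject-if-any-rejects chain. Here is a minimal sketch of that pattern with `java.util.function.Predicate` standing in for the filter interface. `MergeFilterChainSketch` and `acceptAll` are hypothetical names, and stopping at the first rejection is an assumption about evaluation order that does not change the result.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration class; it is not part of Nutch.
class MergeFilterChainSketch {

    /**
     * Returns false as soon as any filter rejects the record,
     * and true when every filter accepts it (or when there are no filters).
     */
    static <T> boolean acceptAll(List<Predicate<T>> filters, T record) {
        for (Predicate<T> filter : filters) {
            if (!filter.test(record)) {
                return false; // one rejection is enough to drop the record
            }
        }
        return true;
    }
}
```

With an empty filter list the sketch accepts every record, which corresponds to merging with no filter extensions configured.
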