Packages that use NutchDocument | |
---|---|
org.apache.nutch.analysis.lang | Text document language identifier. |
org.apache.nutch.indexer | Maintain Lucene full-text indexes. |
org.apache.nutch.indexer.basic | A basic indexing plugin. |
org.apache.nutch.indexer.more | A more indexing plugin. |
org.apache.nutch.indexer.solr | |
org.apache.nutch.microformats.reltag | A microformats Rel-Tag Parser/Indexer/Querier plugin. |
org.apache.nutch.scoring | |
org.apache.nutch.scoring.opic | |
org.creativecommons.nutch | Sample plugins that parse and index Creative Commons metadata. |
Uses of NutchDocument in org.apache.nutch.analysis.lang |
---|
Methods in org.apache.nutch.analysis.lang that return NutchDocument | |
---|---|
NutchDocument | LanguageIndexingFilter.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) |
Methods in org.apache.nutch.analysis.lang with parameters of type NutchDocument | |
---|---|
NutchDocument | LanguageIndexingFilter.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) |
Uses of NutchDocument in org.apache.nutch.indexer |
---|
Methods in org.apache.nutch.indexer that return NutchDocument | |
---|---|
NutchDocument | IndexingFilters.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) Run all defined filters. |
NutchDocument | IndexingFilter.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) Adds fields or otherwise modifies the document that will be indexed for a parse. |
Methods in org.apache.nutch.indexer that return types with arguments of type NutchDocument | |
---|---|
RecordWriter<Text,NutchDocument> | IndexerOutputFormat.getRecordWriter(FileSystem ignored, JobConf job, String name, Progressable progress) |
Methods in org.apache.nutch.indexer with parameters of type NutchDocument | |
---|---|
NutchDocument | IndexingFilters.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) Run all defined filters. |
NutchDocument | IndexingFilter.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) Adds fields or otherwise modifies the document that will be indexed for a parse. |
void | NutchIndexWriter.write(NutchDocument doc) |
Method parameters in org.apache.nutch.indexer with type arguments of type NutchDocument | |
---|---|
void | IndexerMapReduce.reduce(Text key, Iterator<NutchWritable> values, OutputCollector<Text,NutchDocument> output, Reporter reporter) |
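
The entries above describe IndexingFilters.filter as running all defined filters, with each IndexingFilter adding fields to or otherwise modifying the document. A minimal sketch of that chaining idea follows. It does not use the Nutch classes: Doc and Filter are hypothetical stand-ins reduced to the document argument, and the early return on null is an assumption about how a filter would signal that a document should be dropped.

```java
import java.util.ArrayList;
import java.util.List;

public class FilterChainSketch {

    /** Hypothetical stand-in for NutchDocument: a list of "name=value" fields. */
    static class Doc {
        final List<String> fields = new ArrayList<>();
    }

    /** Hypothetical stand-in for IndexingFilter, reduced to the document argument. */
    interface Filter {
        // Assumption: returning null means "do not index this document".
        Doc filter(Doc doc);
    }

    /** Runs every filter in order and stops early if one of them drops the document. */
    static Doc runAll(List<Filter> filters, Doc doc) {
        for (Filter f : filters) {
            doc = f.filter(doc);
            if (doc == null) {
                return null;
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        List<Filter> filters = new ArrayList<>();
        filters.add(d -> { d.fields.add("lang=en"); return d; });
        filters.add(d -> { d.fields.add("title=Example"); return d; });

        Doc result = runAll(filters, new Doc());
        System.out.println(result.fields); // prints [lang=en, title=Example]
    }
}
```

Each filter receives the document returned by the previous one, so the order of the list determines the order in which fields are added.
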
Uses of NutchDocument in org.apache.nutch.indexer.basic |
---|
Methods in org.apache.nutch.indexer.basic that return NutchDocument | |
---|---|
NutchDocument | BasicIndexingFilter.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) |
Methods in org.apache.nutch.indexer.basic with parameters of type NutchDocument | |
---|---|
NutchDocument | BasicIndexingFilter.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) |
Uses of NutchDocument in org.apache.nutch.indexer.more |
---|
Methods in org.apache.nutch.indexer.more that return NutchDocument | |
---|---|
NutchDocument | MoreIndexingFilter.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) |
Methods in org.apache.nutch.indexer.more with parameters of type NutchDocument | |
---|---|
NutchDocument | MoreIndexingFilter.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) |
Uses of NutchDocument in org.apache.nutch.indexer.solr |
---|
Methods in org.apache.nutch.indexer.solr with parameters of type NutchDocument | |
---|---|
void | SolrWriter.write(NutchDocument doc) |
Uses of NutchDocument in org.apache.nutch.microformats.reltag |
---|
Methods in org.apache.nutch.microformats.reltag that return NutchDocument | |
---|---|
NutchDocument | RelTagIndexingFilter.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) |
Methods in org.apache.nutch.microformats.reltag with parameters of type NutchDocument | |
---|---|
NutchDocument | RelTagIndexingFilter.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) |
Uses of NutchDocument in org.apache.nutch.scoring |
---|
Methods in org.apache.nutch.scoring with parameters of type NutchDocument | |
---|---|
float | ScoringFilters.indexerScore(Text url, NutchDocument doc, CrawlDatum dbDatum, CrawlDatum fetchDatum, Parse parse, Inlinks inlinks, float initScore) |
float | ScoringFilter.indexerScore(Text url, NutchDocument doc, CrawlDatum dbDatum, CrawlDatum fetchDatum, Parse parse, Inlinks inlinks, float initScore) This method calculates a Lucene document boost. |
Uses of NutchDocument in org.apache.nutch.scoring.opic |
---|
Methods in org.apache.nutch.scoring.opic with parameters of type NutchDocument | |
---|---|
float | OPICScoringFilter.indexerScore(Text url, NutchDocument doc, CrawlDatum dbDatum, CrawlDatum fetchDatum, Parse parse, Inlinks inlinks, float initScore) Dampen the boost value by scorePower. |
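
The description "Dampen the boost value by scorePower" leaves the formula implicit. The sketch below shows one plausible reading, assuming a power-law damping in which a stored page score is raised to scorePower and then multiplied by initScore. The class name, method name, and the formula itself are assumptions made for illustration; the actual OPICScoringFilter may compute the boost differently.

```java
public class BoostDampingSketch {

    /**
     * Assumed damping rule: boost = score^scorePower * initScore.
     * With 0 < scorePower < 1, large scores are compressed toward smaller boosts.
     */
    static float dampen(float score, float scorePower, float initScore) {
        return (float) Math.pow(score, scorePower) * initScore;
    }

    public static void main(String[] args) {
        System.out.println(dampen(16.0f, 0.5f, 1.0f));  // prints 4.0
        System.out.println(dampen(100.0f, 0.5f, 2.0f)); // prints 20.0
    }
}
```

Under this assumption a scorePower of 1.0 leaves the score undamped, while smaller values flatten the difference between high-scoring and low-scoring pages.
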
Uses of NutchDocument in org.creativecommons.nutch |
---|
Methods in org.creativecommons.nutch that return NutchDocument | |
---|---|
NutchDocument | CCIndexingFilter.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) |
Methods in org.creativecommons.nutch with parameters of type NutchDocument | |
---|---|
void | CCIndexingFilter.addUrlFeatures(NutchDocument doc, String urlString) Add the features represented by a license URL. |
NutchDocument | CCIndexingFilter.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) |