public interface IndexingFilter extends Pluggable, Configurable
Modifier and Type | Field and Description |
---|---|
static String | X_POINT_ID: The name of the extension point. |
Modifier and Type | Method and Description |
---|---|
NutchDocument | filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks): Adds fields or otherwise modifies the document that will be indexed for a parse. |
Methods inherited from interface Configurable: getConf, setConf
static final String X_POINT_ID
NutchDocument filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) throws IndexingException
Parameters:
doc - document instance for collecting fields
parse - parse data instance
url - page url
datum - crawl datum for the page (fetch datum from segment containing fetch status and fetch time)
inlinks - page inlinks
Throws:
IndexingException
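The contract of filter can be illustrated with a minimal sketch. The nested types below are hypothetical stand-ins that reuse only the names from the signature above, so the example compiles without Nutch or Hadoop on the classpath; they are not the real classes. The stand-in interface also omits Pluggable and Configurable, so a real implementation would additionally need to provide getConf and setConf. The field names ("url", "title", "fetchTime", "inlinkCount") and the BasicFieldsFilter class are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class IndexingFilterSketch {

    // Hypothetical stand-ins for the types named in the signature.
    // They model only what this sketch needs and are NOT the Nutch classes.
    public static class NutchDocument {
        private final Map<String, List<String>> fields = new LinkedHashMap<>();

        public void add(String name, String value) {
            fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }

        public List<String> get(String name) {
            return fields.get(name);
        }
    }

    public static class Parse {
        public final String title;
        public Parse(String title) { this.title = title; }
    }

    public static class Text {
        private final String value;
        public Text(String value) { this.value = value; }
        @Override public String toString() { return value; }
    }

    public static class CrawlDatum {
        public final long fetchTime;
        public CrawlDatum(long fetchTime) { this.fetchTime = fetchTime; }
    }

    public static class Inlinks {
        public final int count;
        public Inlinks(int count) { this.count = count; }
    }

    public static class IndexingException extends Exception {
        public IndexingException(String message) { super(message); }
    }

    // Same method shape as the documented interface, minus the super-interfaces.
    public interface IndexingFilter {
        NutchDocument filter(NutchDocument doc, Parse parse, Text url,
                             CrawlDatum datum, Inlinks inlinks) throws IndexingException;
    }

    // Example filter: adds a few fields to the document and returns it.
    public static class BasicFieldsFilter implements IndexingFilter {
        @Override
        public NutchDocument filter(NutchDocument doc, Parse parse, Text url,
                                    CrawlDatum datum, Inlinks inlinks) throws IndexingException {
            if (url == null) {
                throw new IndexingException("url is required");
            }
            doc.add("url", url.toString());
            if (parse != null && parse.title != null) {
                doc.add("title", parse.title);
            }
            if (datum != null) {
                doc.add("fetchTime", Long.toString(datum.fetchTime));
            }
            if (inlinks != null) {
                doc.add("inlinkCount", Integer.toString(inlinks.count));
            }
            return doc;
        }
    }

    public static void main(String[] args) throws IndexingException {
        IndexingFilter filter = new BasicFieldsFilter();
        NutchDocument out = filter.filter(new NutchDocument(), new Parse("Example"),
                new Text("http://example.com/"), new CrawlDatum(42L), new Inlinks(3));
        System.out.println(out.get("title"));
    }
}
```

The sketch mutates and returns the document it was given, which matches the "adds fields or otherwise modifies" wording of the method description; how a real filter chain treats the return value is not covered here.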
Copyright © 2015 The Apache Software Foundation