org.apache.nutch.indexer.field
Interface FieldFilter
- All Superinterfaces:
- Configurable, Pluggable
public interface FieldFilter
- extends Pluggable, Configurable
Filter to manipulate FieldWritable objects for a given URL during indexing.
Field filters are responsible for converting FieldWritable objects into
Lucene fields and adding those fields to the Lucene document.
X_POINT_ID
static final String X_POINT_ID
filter
Document filter(String url,
Document doc,
List<FieldWritable> fields)
throws IndexingException
- Returns the document to which fields are being added, or null to stop
processing this URL and add nothing to the index. All FieldWritable objects
for a URL are aggregated from the databases passed into the FieldIndexer
and are then passed to the field filters. A filter can therefore add,
remove, or change fields before they are indexed.
- Parameters:
url
- The URL to index.
doc
- The Lucene document.
fields
- The list of FieldWritable objects representing fields for
the index.
- Returns:
- The Lucene Document, or null to stop processing and not index any
content for this URL.
- Throws:
IndexingException
- If an error occurs during indexing
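The return-or-null contract described above can be illustrated with a minimal, self-contained sketch. Every type in it (Document, FieldWritable, IndexingException, and the FieldFilter interface itself) is a simplified, hypothetical stand-in so the example compiles without Nutch or Lucene on the classpath; the real classes carry more state, and the real interface also extends Pluggable and Configurable. CopyFieldsFilter and runFilters are illustrative names, and the chaining behaviour in runFilters is an assumption about how a caller might honour a null return, not a description of the FieldIndexer.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: all nested types are hypothetical stand-ins, NOT the Nutch/Lucene classes.
final class FieldFilterSketch {

    /** Stand-in for the Lucene Document: an ordered name-to-value map. */
    static final class Document {
        final Map<String, String> fields = new LinkedHashMap<>();
    }

    /** Stand-in for FieldWritable, reduced to a field name and value. */
    static final class FieldWritable {
        final String name;
        final String value;

        FieldWritable(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    /** Stand-in for the checked IndexingException. */
    static class IndexingException extends Exception {
        IndexingException(String message) {
            super(message);
        }
    }

    /** Mirrors the documented method shape; Pluggable and Configurable are omitted. */
    interface FieldFilter {
        Document filter(String url, Document doc, List<FieldWritable> fields)
                throws IndexingException;
    }

    /** Copies every field into the document; returns null when there is nothing to index. */
    static final class CopyFieldsFilter implements FieldFilter {
        @Override
        public Document filter(String url, Document doc, List<FieldWritable> fields)
                throws IndexingException {
            if (url == null) {
                throw new IndexingException("url must not be null");
            }
            if (fields == null || fields.isEmpty()) {
                return null; // signal: stop processing, index nothing for this URL
            }
            for (FieldWritable field : fields) {
                doc.fields.put(field.name, field.value);
            }
            return doc;
        }
    }

    /** Assumed caller behaviour: run filters in order and stop at the first null. */
    static Document runFilters(List<FieldFilter> filters, String url, Document doc,
                               List<FieldWritable> fields) throws IndexingException {
        for (FieldFilter filter : filters) {
            doc = filter.filter(url, doc, fields);
            if (doc == null) {
                return null;
            }
        }
        return doc;
    }
}
```

In this sketch a filter signals "skip this URL" purely through its return value, so a caller only needs a null check after each filter; exceptions are reserved for genuine indexing errors.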
Copyright © 2006 The Apache Software Foundation