org.apache.nutch.indexer
Class IndexingFilters
java.lang.Object
org.apache.nutch.indexer.IndexingFilters
public class IndexingFilters
- extends Object
Creates and caches plugins that implement IndexingFilter.
Constructor Summary
IndexingFilters(org.apache.hadoop.conf.Configuration conf)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
INDEXINGFILTER_ORDER
public static final String INDEXINGFILTER_ORDER
- See Also:
- Constant Field Values
LOG
public static final org.slf4j.Logger LOG
IndexingFilters
public IndexingFilters(org.apache.hadoop.conf.Configuration conf)
filter
public NutchDocument filter(NutchDocument doc,
String url,
WebPage page)
throws IndexingException
- Run all defined filters.
- Throws:
IndexingException
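The idea of running every defined filter over one document can be sketched with stand-in types. This is a minimal illustration, not the Nutch implementation: DocFilter, runAll and the Map-based document are hypothetical names introduced here, and the convention that a filter returns null to discard a document is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FilterChainSketch {

    /** Hypothetical stand-in for an indexing filter; not the Nutch interface. */
    interface DocFilter {
        // Returns the (possibly modified) document, or null to discard it (assumed convention).
        Map<String, String> filter(Map<String, String> doc, String url);
    }

    /** Applies each filter in order and stops as soon as one discards the document. */
    static Map<String, String> runAll(List<DocFilter> filters, Map<String, String> doc, String url) {
        for (DocFilter f : filters) {
            doc = f.filter(doc, url);
            if (doc == null) {
                return null; // a filter rejected the document, so later filters are skipped
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        List<DocFilter> filters = new ArrayList<>();
        // First filter records the URL as a field of the document.
        filters.add((doc, url) -> { doc.put("url", url); return doc; });
        // Second filter discards documents whose URL ends in ".tmp".
        filters.add((doc, url) -> url.endsWith(".tmp") ? null : doc);

        System.out.println(runAll(filters, new HashMap<>(), "http://example.org/a.html"));
        System.out.println(runAll(filters, new HashMap<>(), "http://example.org/b.tmp"));
    }
}
```

In this sketch the order of the list decides the order in which filters see the document, which is the kind of ordering a setting such as INDEXINGFILTER_ORDER would plausibly control.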
getFields
public Collection<WebPage.Field> getFields()
- Gets all the fields for a given WebPage.
Many datastores need to set up the MapReduce job by specifying the fields
needed. All extensions that work on WebPage are able to specify which fields
they need.
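One way to picture how per-extension field requirements could be combined into a single collection is a set union. The sketch below uses stand-in types only: Field, FieldAware and collectFields are hypothetical names, the enum constants are illustrative rather than the real WebPage.Field values, and treating a null return as "no fields needed" is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

class FieldUnionSketch {

    /** Hypothetical stand-in for WebPage.Field; the constants are illustrative. */
    enum Field { TITLE, TEXT, INLINKS, METADATA }

    /** Hypothetical stand-in for an extension that declares the fields it reads. */
    interface FieldAware {
        Collection<Field> getFields(); // may return null when no fields are needed (assumed)
    }

    /** Unions the fields requested by every extension, skipping null answers. */
    static Set<Field> collectFields(List<FieldAware> extensions) {
        Set<Field> all = EnumSet.noneOf(Field.class);
        for (FieldAware e : extensions) {
            Collection<Field> fields = e.getFields();
            if (fields != null) {
                all.addAll(fields);
            }
        }
        return all;
    }

    public static void main(String[] args) {
        List<FieldAware> extensions = new ArrayList<>();
        extensions.add(() -> Arrays.asList(Field.TITLE, Field.TEXT));
        extensions.add(() -> Arrays.asList(Field.TEXT, Field.INLINKS));
        extensions.add(() -> null);

        // Duplicates collapse, so TEXT appears once in the combined set.
        System.out.println(collectFields(extensions));
    }
}
```

A job setup step could then pass the combined set to the datastore query so that only the requested columns are loaded.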
Copyright © 2013 The Apache Software Foundation