org.apache.nutch.indexer
Class IndexingFilters

java.lang.Object
  extended by org.apache.nutch.indexer.IndexingFilters

public class IndexingFilters
extends Object

Creates and caches the plugins that implement IndexingFilter.


Field Summary
static String INDEXINGFILTER_ORDER
           
static org.slf4j.Logger LOG
           
 
Constructor Summary
IndexingFilters(org.apache.hadoop.conf.Configuration conf)
           
 
Method Summary
 NutchDocument filter(NutchDocument doc, String url, WebPage page)
          Run all defined filters.
 Collection<WebPage.Field> getFields()
          Gets all the fields for a given WebPage. Many datastores need to set up the MapReduce job by specifying the fields needed.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

INDEXINGFILTER_ORDER

public static final String INDEXINGFILTER_ORDER
See Also:
Constant Field Values

LOG

public static final org.slf4j.Logger LOG
Constructor Detail

IndexingFilters

public IndexingFilters(org.apache.hadoop.conf.Configuration conf)
Method Detail

filter

public NutchDocument filter(NutchDocument doc,
                            String url,
                            WebPage page)
                     throws IndexingException
Run all defined filters.

Throws:
IndexingException
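
As an illustration of what "run all defined filters" can look like, the following is a minimal, self-contained sketch of a filter chain. It does not use the Nutch classes: Doc, Filter, and runAll are hypothetical stand-ins for NutchDocument, IndexingFilter, and this method. The sketch assumes that filters run in list order and that a filter returning null discards the document and stops the chain; neither assumption is stated on this page. The WebPage argument and the IndexingException declared by the real method are omitted for brevity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FilterChainSketch {

    /** Hypothetical stand-in for NutchDocument: a bag of named field values. */
    static class Doc {
        final Map<String, String> fields = new LinkedHashMap<>();
    }

    /** Hypothetical stand-in for the IndexingFilter plugin interface. */
    interface Filter {
        Doc filter(Doc doc, String url);
    }

    /**
     * Passes the document through every filter in order.
     * Assumption: a null result means "discard", so the chain stops early.
     */
    static Doc runAll(List<Filter> filters, Doc doc, String url) {
        for (Filter f : filters) {
            doc = f.filter(doc, url);
            if (doc == null) {
                return null;
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        List<Filter> chain = new ArrayList<>();
        // First filter: record the URL as a field.
        chain.add((d, u) -> { d.fields.put("url", u); return d; });
        // Second filter: drop anything that ends in ".tmp".
        chain.add((d, u) -> u.endsWith(".tmp") ? null : d);

        Doc kept = runAll(chain, new Doc(), "http://example.com/a");
        Doc dropped = runAll(chain, new Doc(), "http://example.com/b.tmp");

        System.out.println("kept fields: " + kept.fields);
        System.out.println("dropped is null: " + (dropped == null));
    }
}
```

A caller of the real method would presumably need the same null check before handing the returned document to an index writer, if the discard convention assumed here holds.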

getFields

public Collection<WebPage.Field> getFields()
Gets all the fields for a given WebPage. Many datastores need to set up the MapReduce job by specifying the fields needed. All extensions that work on WebPage are able to specify what fields they need.
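
The idea described above, where each extension declares the fields it needs and the job requests only those, can be sketched as a set union. This is a self-contained illustration, not the Nutch implementation: the Field enum and its constants, the FieldUser interface, and the union helper are hypothetical stand-ins for WebPage.Field, the filter plugins, and this method. Treating a null return as "no fields" is also an assumption.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

public class FieldUnionSketch {

    /** Hypothetical stand-in for WebPage.Field. */
    enum Field { TITLE, TEXT, CONTENT_TYPE, FETCH_TIME }

    /** Hypothetical stand-in for an extension that reads WebPage fields. */
    interface FieldUser {
        Collection<Field> getFields();
    }

    /** Collects the distinct fields requested by all extensions. */
    static Set<Field> union(List<FieldUser> users) {
        Set<Field> all = EnumSet.noneOf(Field.class);
        for (FieldUser user : users) {
            Collection<Field> wanted = user.getFields();
            if (wanted != null) { // assumption: null means "needs nothing"
                all.addAll(wanted);
            }
        }
        return all;
    }

    public static void main(String[] args) {
        FieldUser basic = () -> Arrays.asList(Field.TITLE, Field.TEXT);
        FieldUser dates = () -> Arrays.asList(Field.TEXT, Field.FETCH_TIME);

        // TEXT is requested twice but appears once in the result.
        System.out.println(union(Arrays.asList(basic, dates)));
    }
}
```

Using a set keeps the result free of duplicates, so a datastore query built from it would load each column at most once.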



Copyright © 2013 The Apache Software Foundation