org.apache.nutch.indexer.basic
Class BasicIndexingFilter

java.lang.Object
  extended by org.apache.nutch.indexer.basic.BasicIndexingFilter
All Implemented Interfaces:
Configurable, IndexingFilter, Pluggable

public class BasicIndexingFilter
extends Object
implements IndexingFilter

Adds basic searchable fields to a document.


Field Summary
static org.apache.commons.logging.Log LOG
           
 
Fields inherited from interface org.apache.nutch.indexer.IndexingFilter
X_POINT_ID
 
Constructor Summary
BasicIndexingFilter()
           
 
Method Summary
 void addIndexBackendOptions(Configuration conf)
          Adds index-level configuration options.
 NutchDocument filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks)
          Adds fields or otherwise modifies the document that will be indexed for a parse.
 Configuration getConf()
           
 void setConf(Configuration conf)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

public static final org.apache.commons.logging.Log LOG
Constructor Detail

BasicIndexingFilter

public BasicIndexingFilter()
Method Detail

filter

public NutchDocument filter(NutchDocument doc,
                            Parse parse,
                            Text url,
                            CrawlDatum datum,
                            Inlinks inlinks)
                     throws IndexingException
Description copied from interface: IndexingFilter
Adds fields or otherwise modifies the document that will be indexed for a parse. Unwanted documents can be removed from indexing by returning a null value.

Specified by:
filter in interface IndexingFilter
Parameters:
doc - document instance for collecting fields
parse - parse data instance
url - page url
datum - crawl datum for the page
inlinks - page inlinks
Returns:
modified (or a new) document instance, or null (meaning the document should be discarded)
Throws:
IndexingException
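
The null-return contract described above can be illustrated with a minimal, self-contained sketch. It does not use the Nutch classes: SketchDoc, SketchFilter, runFilters, ADD_URL and DROP_EMPTY_URL are hypothetical stand-ins introduced only for this example, and the loop shows one way a caller might honor a null result, not necessarily how Nutch itself invokes its filters.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-in types for illustration only; these are NOT the Nutch classes.
class FilterContractSketch {

    /** Hypothetical, simplified document: an ordered map of field names to values. */
    static final class SketchDoc {
        final Map<String, String> fields = new LinkedHashMap<>();
    }

    /** Hypothetical, simplified counterpart of the filter contract. */
    interface SketchFilter {
        // Returns the (possibly modified) document, or null to discard it.
        SketchDoc filter(SketchDoc doc, String url);
    }

    /** Adds a "url" field, mirroring the idea of adding a basic searchable field. */
    static final SketchFilter ADD_URL = (doc, url) -> {
        doc.fields.put("url", url);
        return doc;
    };

    /** Discards documents whose URL is empty by returning null. */
    static final SketchFilter DROP_EMPTY_URL = (doc, url) -> url.isEmpty() ? null : doc;

    /** Runs the filters in order and stops as soon as one returns null. */
    static SketchDoc runFilters(List<SketchFilter> filters, SketchDoc doc, String url) {
        for (SketchFilter f : filters) {
            doc = f.filter(doc, url);
            if (doc == null) {
                return null; // the document is dropped from indexing
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        List<SketchFilter> filters = new ArrayList<>();
        filters.add(DROP_EMPTY_URL);
        filters.add(ADD_URL);

        SketchDoc kept = runFilters(filters, new SketchDoc(), "http://example.org/");
        System.out.println(kept.fields);

        SketchDoc dropped = runFilters(filters, new SketchDoc(), "");
        System.out.println(dropped == null);
    }
}
```

A caller that receives null simply skips the document; a non-null result is passed on to the next filter or to the indexing backend.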

addIndexBackendOptions

public void addIndexBackendOptions(Configuration conf)
Description copied from interface: IndexingFilter
Adds index-level configuration options. Implementations can update the given configuration to pass document-independent information to indexing backends. As a rule of thumb, prefix meta keys with the name of the intended backend. For example, when passing information to the Lucene backend, prefix keys with "lucene.".

Specified by:
addIndexBackendOptions in interface IndexingFilter
Parameters:
conf - Configuration instance.
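
The key-prefix rule of thumb can be sketched as follows. This is an illustration under stated assumptions: java.util.Properties stands in for the Configuration type so that the example runs on its own, and the option name "example.option", its value, and the helper backendKey are hypothetical, not keys or methods defined by Nutch.

```java
import java.util.Properties;

// java.util.Properties stands in for the Configuration type; this is NOT the Nutch API.
class BackendOptionsSketch {

    /** Builds a key namespaced by the backend it is intended for, e.g. "lucene.<name>". */
    static String backendKey(String backend, String name) {
        return backend + "." + name;
    }

    /** Adds a document-independent option; the option name and value are hypothetical. */
    static void addIndexBackendOptions(Properties conf) {
        conf.setProperty(backendKey("lucene", "example.option"), "example-value");
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        addIndexBackendOptions(conf);
        System.out.println(conf.getProperty("lucene.example.option"));
    }
}
```

Namespacing keys this way lets each backend read only the entries that carry its own prefix and ignore the rest.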

setConf

public void setConf(Configuration conf)
Specified by:
setConf in interface Configurable

getConf

public Configuration getConf()
Specified by:
getConf in interface Configurable


Copyright © 2006 The Apache Software Foundation