org.apache.nutch.analysis
Class NutchAnalyzer

java.lang.Object
  extended by org.apache.lucene.analysis.Analyzer
      extended by org.apache.nutch.analysis.NutchAnalyzer
All Implemented Interfaces:
Closeable, Configurable, Pluggable
Direct Known Subclasses:
NutchDocumentAnalyzer

public abstract class NutchAnalyzer
extends Analyzer
implements Configurable, Pluggable

Extension point for analysis. All plugins that implement this extension point are run sequentially on the parse.

Author:
Jérôme Charron

Field Summary
protected  Configuration conf
          The current Configuration
 
Fields inherited from class org.apache.lucene.analysis.Analyzer
overridesTokenStreamMethod
 
Constructor Summary
NutchAnalyzer()
           
 
Method Summary
 Configuration getConf()
           
 void setConf(Configuration conf)
           
abstract  TokenStream tokenStream(String fieldName, Reader reader)
          Creates a TokenStream which tokenizes all the text in the provided Reader.
 
Methods inherited from class org.apache.lucene.analysis.Analyzer
close, getOffsetGap, getPositionIncrementGap, getPreviousTokenStream, reusableTokenStream, setOverridesTokenStreamMethod, setPreviousTokenStream
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

conf

protected Configuration conf
The current Configuration

Constructor Detail

NutchAnalyzer

public NutchAnalyzer()

Method Detail

tokenStream

public abstract TokenStream tokenStream(String fieldName,
                                        Reader reader)
Creates a TokenStream which tokenizes all the text in the provided Reader.

Specified by:
tokenStream in class Analyzer
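
As a rough illustration of the contract, a concrete subclass supplies tokenStream and turns the characters from the Reader into tokens. The sketch below is not the Nutch or Lucene implementation. To compile on its own it uses hypothetical stand-in types defined in the block (SketchAnalyzer, WhitespaceSketchAnalyzer) and returns an Iterator<String> in place of a Lucene TokenStream. Only the method shape (field name plus Reader in, token sequence out) mirrors the signature documented above.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for NutchAnalyzer: a single abstract method with the
// same parameters as tokenStream(String, Reader). Not the real class.
abstract class SketchAnalyzer {
  abstract Iterator<String> tokenStream(String fieldName, Reader reader);
}

// Hypothetical subclass: splits all text from the Reader on whitespace.
class WhitespaceSketchAnalyzer extends SketchAnalyzer {
  @Override
  Iterator<String> tokenStream(String fieldName, Reader reader) {
    // Read the whole Reader into memory (acceptable for a sketch only).
    StringBuilder text = new StringBuilder();
    char[] buffer = new char[1024];
    try {
      int n;
      while ((n = reader.read(buffer)) != -1) {
        text.append(buffer, 0, n);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    // Split on runs of whitespace and drop the empty leading piece, if any.
    List<String> tokens = new ArrayList<>();
    for (String part : text.toString().split("\\s+")) {
      if (!part.isEmpty()) {
        tokens.add(part);
      }
    }
    return tokens.iterator();
  }
}
```

A real subclass would extend NutchAnalyzer and return a Lucene TokenStream, typically a tokenizer wrapped around the Reader. This sketch ignores fieldName, although an actual analyzer may choose a different token chain per field.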

setConf

public void setConf(Configuration conf)
Specified by:
setConf in interface Configurable

getConf

public Configuration getConf()
Specified by:
getConf in interface Configurable
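
The setConf and getConf pair follows the usual Configurable pattern: the framework hands the object a configuration, and the object stores it (here, presumably in the protected conf field) and returns it on request. The sketch below shows that pattern under stated assumptions. SketchConfiguration and ConfigurableSketch are hypothetical stand-ins defined in the block so that it compiles without Hadoop or Nutch on the classpath, and the property name "sketch.lowercase" is invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a configuration object: a string-to-string map
// with a default-value lookup. Not the real Hadoop Configuration class.
class SketchConfiguration {
  private final Map<String, String> values = new HashMap<>();

  void set(String name, String value) {
    values.put(name, value);
  }

  String get(String name, String defaultValue) {
    String value = values.get(name);
    return value != null ? value : defaultValue;
  }
}

// Hypothetical stand-in for a class following the Configurable contract.
class ConfigurableSketch {
  // Mirrors the documented protected field; null until setConf is called.
  protected SketchConfiguration conf;

  public void setConf(SketchConfiguration conf) {
    this.conf = conf;
  }

  public SketchConfiguration getConf() {
    return conf;
  }

  // Example of a subclass-style lookup; "sketch.lowercase" is an invented key.
  boolean lowercaseEnabled() {
    return conf != null
        && Boolean.parseBoolean(conf.get("sketch.lowercase", "false"));
  }
}
```

Reading settings lazily through the stored configuration, rather than in the constructor, matters in this pattern because the no-argument constructor runs before setConf is called.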


Copyright © 2006 The Apache Software Foundation