org.apache.lucene.analysis.compound
Class HyphenationCompoundWordTokenFilterFactory
java.lang.Object
org.apache.lucene.analysis.util.AbstractAnalysisFactory
org.apache.lucene.analysis.util.TokenFilterFactory
org.apache.lucene.analysis.compound.HyphenationCompoundWordTokenFilterFactory
- All Implemented Interfaces:
- ResourceLoaderAware
public class HyphenationCompoundWordTokenFilterFactory
extends TokenFilterFactory
implements ResourceLoaderAware
Factory for HyphenationCompoundWordTokenFilter.
This factory accepts the following parameters:
hyphenator (mandatory): path to the FOP XML hyphenation pattern file. See http://offo.sourceforge.net/hyphenation/.
encoding (optional): encoding of the XML hyphenation file. Defaults to UTF-8.
dictionary (optional): dictionary of words. Defaults to no dictionary.
minWordSize (optional): minimum length a word must have to be decomposed. Defaults to 5.
minSubwordSize (optional): minimum length of subwords. Defaults to 2.
maxSubwordSize (optional): maximum length of subwords. Defaults to 15.
onlyLongestMatch (optional): if true, adds only the longest matching subword to the stream. Defaults to false.
<fieldType name="text_hyphncomp" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.HyphenationCompoundWordTokenFilterFactory" hyphenator="hyphenator.xml" encoding="UTF-8"
dictionary="dictionary.txt" minWordSize="5" minSubwordSize="2" maxSubwordSize="15" onlyLongestMatch="false"/>
</analyzer>
</fieldType>
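The attributes in the configuration above reach the factory as a string-to-string argument map. The following is a minimal sketch, in plain Java without any Lucene dependency, of how those arguments and their documented defaults could be resolved. The class `HyphenationArgsSketch`, its nested `Settings` holder, and the `resolve` and `intOrDefault` helpers are hypothetical names introduced only for illustration; they are not part of Lucene and do not claim to reproduce the factory's internal code.

```java
import java.util.Map;

/** Illustrative only: mirrors the documented parameters, not Lucene's implementation. */
final class HyphenationArgsSketch {

    /** Hypothetical value holder for the resolved settings. */
    static final class Settings {
        final String hyphenator;        // mandatory
        final String encoding;          // documented default: UTF-8
        final String dictionary;        // null stands for "no dictionary"
        final int minWordSize;          // documented default: 5
        final int minSubwordSize;       // documented default: 2
        final int maxSubwordSize;       // documented default: 15
        final boolean onlyLongestMatch; // documented default: false

        Settings(String hyphenator, String encoding, String dictionary,
                 int minWordSize, int minSubwordSize, int maxSubwordSize,
                 boolean onlyLongestMatch) {
            this.hyphenator = hyphenator;
            this.encoding = encoding;
            this.dictionary = dictionary;
            this.minWordSize = minWordSize;
            this.minSubwordSize = minSubwordSize;
            this.maxSubwordSize = maxSubwordSize;
            this.onlyLongestMatch = onlyLongestMatch;
        }
    }

    /** Resolves the argument map, applying the defaults listed in the parameter table. */
    static Settings resolve(Map<String, String> args) {
        String hyphenator = args.get("hyphenator");
        if (hyphenator == null) {
            // The only mandatory parameter according to the documentation.
            throw new IllegalArgumentException("Missing mandatory parameter: hyphenator");
        }
        return new Settings(
                hyphenator,
                args.getOrDefault("encoding", "UTF-8"),
                args.get("dictionary"),
                intOrDefault(args, "minWordSize", 5),
                intOrDefault(args, "minSubwordSize", 2),
                intOrDefault(args, "maxSubwordSize", 15),
                Boolean.parseBoolean(args.getOrDefault("onlyLongestMatch", "false")));
    }

    /** Parses an integer argument, falling back to the given default when absent. */
    private static int intOrDefault(Map<String, String> args, String key, int defaultValue) {
        String value = args.get(key);
        return value == null ? defaultValue : Integer.parseInt(value);
    }
}
```

With this sketch, a map containing only `hyphenator="hyphenator.xml"` resolves to the same effective settings as the configuration above, minus the dictionary, because every other attribute in that example restates its default.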
- See Also:
HyphenationCompoundWordTokenFilter
Methods inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory
assureMatchVersion, get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitFileNames
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
HyphenationCompoundWordTokenFilterFactory
public HyphenationCompoundWordTokenFilterFactory(Map<String,String> args)
- Creates a new HyphenationCompoundWordTokenFilterFactory
inform
public void inform(ResourceLoader loader)
throws IOException
- Description copied from interface: ResourceLoaderAware
- Initializes this component with the provided ResourceLoader (used for loading classes, files, etc.).
- Specified by: inform in interface ResourceLoaderAware
- Throws: IOException
create
public HyphenationCompoundWordTokenFilter create(TokenStream input)
- Description copied from class: TokenFilterFactory
- Transforms the specified input TokenStream.
- Specified by: create in class TokenFilterFactory
Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.