Class CapitalizationFilterFactory
java.lang.Object
org.apache.lucene.analysis.AbstractAnalysisFactory
org.apache.lucene.analysis.TokenFilterFactory
org.apache.lucene.analysis.miscellaneous.CapitalizationFilterFactory
Factory for CapitalizationFilter.
The factory takes parameters:
- "onlyFirstWord" - whether to capitalize only the first word (true) or every word (false).
- "keep" - a list of words to keep unchanged, separated by whitespace.
- "keepIgnoreCase" - true or false. If true, the keep list is treated as case-insensitive.
- "forceFirstLetter" - force the first letter to be capitalized even if the word is in the keep list.
- "okPrefix" - do not change a word's capitalization if the word begins with an entry in this list. For example, if "McK" is on the okPrefix list, the word "McKinley" is not changed to "Mckinley".
- "minWordLength" - the minimum length a word must have for capitalization to be applied. If minWordLength is 3, "and" becomes "And" but "or" stays "or".
- "maxWordCount" - if the token contains more than maxWordCount words, the capitalization is assumed to be correct.
<fieldType name="text_cptlztn" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.CapitalizationFilterFactory" onlyFirstWord="true" keep="java solr lucene"
            keepIgnoreCase="false" okPrefix="McK McD McA"/>
  </analyzer>
</fieldType>
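The per-word rules described by "keep", "minWordLength" and "okPrefix" can be illustrated with a small standalone sketch. This is a simplified model of the documented behavior, not Lucene's implementation: the class CapitalizationRulesSketch and the method capitalizeWord are hypothetical names introduced here, and the sketch ignores "onlyFirstWord", "keepIgnoreCase", "forceFirstLetter" and "maxWordCount".

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Illustrative model of the documented per-word rules; NOT Lucene's implementation. */
final class CapitalizationRulesSketch {

    /**
     * Applies the documented rules to a single word.
     * Parameter names mirror the factory parameters; the method itself is hypothetical.
     */
    static String capitalizeWord(String word, Set<String> keep,
                                 List<String> okPrefix, int minWordLength) {
        if (word.isEmpty()) {
            return word;
        }
        // "keep": words on the keep list are left untouched.
        if (keep.contains(word)) {
            return word;
        }
        // "minWordLength": words shorter than the minimum are left untouched.
        if (word.length() < minWordLength) {
            return word;
        }
        // "okPrefix": words starting with a listed prefix are left untouched.
        for (String prefix : okPrefix) {
            if (word.startsWith(prefix)) {
                return word;
            }
        }
        // Otherwise: upper-case the first character, lower-case the rest.
        return word.substring(0, 1).toUpperCase(Locale.ROOT)
                + word.substring(1).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Same keep and okPrefix values as in the fieldType example; minWordLength = 3.
        Set<String> keep = Set.of("java", "solr", "lucene");
        List<String> okPrefix = List.of("McK", "McD", "McA");
        for (String w : List.of("and", "or", "McKinley", "java", "hELLO")) {
            System.out.println(w + " -> " + capitalizeWord(w, keep, okPrefix, 3));
        }
    }
}
```

With minWordLength set to 3, the sketch turns "and" into "And" and leaves "or" alone, matching the parameter description. "McKinley" is unchanged because of the "McK" prefix, and "java" is unchanged because it is on the keep list.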
Since: solr 1.3
SPI Name (case-insensitive: if the name is 'htmlStrip', 'htmlstrip' can be used when looking up the service): "capitalization"
Field Summary
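Modifier and Type      Field                 Description
static final String    FORCE_FIRST_LETTER
static final String    KEEP
static final String    KEEP_IGNORE_CASE
static final String    MAX_TOKEN_LENGTH
static final String    MAX_WORD_COUNT
static final String    MIN_WORD_LENGTH
static final String    NAME                  SPI name
static final String    OK_PREFIX
static final String    ONLY_FIRST_WORD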
Fields inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
Constructor Summary
Constructor                                             Description
CapitalizationFilterFactory()                           Default ctor for compatibility with SPI
CapitalizationFilterFactory(Map<String,String> args)    Creates a new CapitalizationFilterFactory
Method Summary
Methods inherited from class org.apache.lucene.analysis.TokenFilterFactory
availableTokenFilters, findSPIName, forName, lookupClass, normalize, reloadTokenFilters
Methods inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
defaultCtorException, get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
Field Details
NAME
SPI name
KEEP
KEEP_IGNORE_CASE
OK_PREFIX
MIN_WORD_LENGTH
MAX_WORD_COUNT
MAX_TOKEN_LENGTH
ONLY_FIRST_WORD
FORCE_FIRST_LETTER
Constructor Details
CapitalizationFilterFactory
public CapitalizationFilterFactory(Map<String,String> args)
Creates a new CapitalizationFilterFactory
CapitalizationFilterFactory
public CapitalizationFilterFactory()
Default ctor for compatibility with SPI
Method Details
create
Specified by: create in class TokenFilterFactory