Lucene.Net.Analysis.Ru namespace
Classes
Class | Description
---|---
RussianAnalyzer | Analyzer for the Russian language. Supports an external list of stopwords (words that will not be indexed at all). A default set of stopwords is used unless an alternative list is specified.
RussianCharsets | Contains encoding schemes (charsets) and a ToLowerCase() method implementation for Russian characters in Unicode, KOI8 and CP1251. Each encoding scheme contains lowercase (positions 0-31) and uppercase (positions 32-63) characters. Other encoding schemes (such as ISO-8859-5 or a custom one) can be supported by adding a new charset and extending ToLowerCase() with logic for that charset.
RussianLetterTokenizer | A tokenizer that extends LetterTokenizer by additionally looking up letters in a given "Russian charset". LetterTokenizer relies on the Character.isLetter() method, which cannot detect letters in encodings like CP1251 and KOI8 (the well-known problems with the 0xD7 and 0xF7 characters).
RussianLowerCaseFilter | Normalizes token text to lower case according to the given ("Russian") charset.
RussianStemFilter | A filter that stems Russian words. The implementation was inspired by GermanStemFilter. The input should be passed through RussianLowerCaseFilter before it reaches RussianStemFilter, because RussianStemFilter only works with the lowercase part of any "Russian" charset.
RussianStemmer | Russian stemming algorithm implementation (see http://snowball.sourceforge.net for a detailed description).
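
The charset layout described for RussianCharsets (lowercase at positions 0-31, uppercase at positions 32-63) and the letter lookup described for RussianLetterTokenizer can be illustrated with a minimal, self-contained sketch. It is written in Python rather than against the Lucene.Net API, and every name in it (`UNICODE_RUSSIAN`, `CP1251_RAW`, `to_lower_case`, `is_token_char`, `tokenize`) is hypothetical. The table contents are an assumption based on the standard Unicode and Windows-1251 (CP1251) code point ranges, not a copy of the library's data, and the letter ё is left out to keep each half at 32 entries.

```python
# Hypothetical 64-entry charset tables (not the library's actual data):
# positions 0-31 hold lowercase letters, positions 32-63 the uppercase ones.
UNICODE_RUSSIAN = ([chr(0x0430 + i) for i in range(32)] +   # а..я
                   [chr(0x0410 + i) for i in range(32)])    # А..Я

# Windows-1251 bytes kept as raw code points, i.e. text that was never decoded:
# 0xE0-0xFF are the lowercase letters, 0xC0-0xDF the uppercase ones.
CP1251_RAW = ([chr(0xE0 + i) for i in range(32)] +
              [chr(0xC0 + i) for i in range(32)])


def to_lower_case(ch, charset):
    """Map an uppercase entry (32-63) to its lowercase partner (0-31)."""
    try:
        pos = charset.index(ch)
    except ValueError:
        return ch                      # not part of this charset: unchanged
    return charset[pos - 32] if pos >= 32 else ch


def is_token_char(ch, charset):
    """Accept ordinary letters plus anything listed in the charset."""
    return ch.isalpha() or ch in charset


def tokenize(text, charset):
    """Split text into lowercased runs of token characters."""
    tokens, current = [], []
    for ch in text:
        if is_token_char(ch, charset):
            current.append(to_lower_case(ch, charset))
        elif current:
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


if __name__ == "__main__":
    print(tokenize("Привет, Мир!", UNICODE_RUSSIAN))
    # The raw code point 0xD7 is the multiplication sign in Unicode, so a
    # generic letter test rejects it, while the charset lookup accepts it.
    print("\xd7".isalpha(), "\xd7" in CP1251_RAW)
```

The last two lines show why a charset-aware lookup is needed: a generic letter test treats the undecoded 0xD7 and 0xF7 values as symbols, while the table recognizes them as an uppercase/lowercase pair. Supporting another encoding in this sketch amounts to adding one more 64-entry table with the same layout.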