Lucene.Net
3.0.3
Lucene.Net is a .NET port of the Java Lucene Indexing Library
Analyzer for the German language. Supports an external list of stop words (words that will not be indexed at all) and an external list of exclusions (words that will be indexed but not stemmed). A default set of stop words is used unless an alternative list is specified; the exclusion list is empty by default.
Inherits Lucene.Net.Analysis.Analyzer.
Public Member Functions
GermanAnalyzer ()
Builds an analyzer with the default stop words (see GetDefaultStopSet).
GermanAnalyzer (Version matchVersion)
Builds an analyzer with the default stop words (see GetDefaultStopSet).
GermanAnalyzer (Version matchVersion, bool normalizeDin2)
Builds an analyzer with the default stop words (see GetDefaultStopSet).
GermanAnalyzer (Version matchVersion, ISet< string > stopwords)
Builds an analyzer with the given stop words, using the default DIN-5007-1 stemmer.
GermanAnalyzer (Version matchVersion, ISet< string > stopwords, bool normalizeDin2)
Builds an analyzer with the given stop words.
GermanAnalyzer (Version matchVersion, ISet< string > stopwords, ISet< string > stemExclusionSet)
Builds an analyzer with the given stop words, using the default DIN-5007-1 stemmer.
GermanAnalyzer (Version matchVersion, ISet< string > stopwords, ISet< string > stemExclusionSet, bool normalizeDin2)
Builds an analyzer with the given stop words.
GermanAnalyzer (Version matchVersion, params string[] stopwords)
Builds an analyzer with the given stop words.
GermanAnalyzer (Version matchVersion, IDictionary< string, string > stopwords)
Builds an analyzer with the given stop words.
GermanAnalyzer (Version matchVersion, FileInfo stopwords)
Builds an analyzer with the given stop words.
void SetStemExclusionTable (String[] exclusionlist)
Builds an exclusion list from an array of Strings.
void SetStemExclusionTable (IDictionary< string, string > exclusionlist)
Builds an exclusion list from an IDictionary.
void SetStemExclusionTable (FileInfo exclusionlist)
Builds an exclusion list from the words contained in the given file.
override TokenStream TokenStream (String fieldName, TextReader reader)
Creates a TokenStream which tokenizes all the text in the provided TextReader.
Public Member Functions inherited from Lucene.Net.Analysis.Analyzer
abstract TokenStream TokenStream (String fieldName, System.IO.TextReader reader)
Creates a TokenStream which tokenizes all the text in the provided Reader. Must be able to handle a null field name for backward compatibility.
virtual TokenStream ReusableTokenStream (String fieldName, System.IO.TextReader reader)
Creates a TokenStream that is allowed to be re-used from the previous time that the same thread called this method. Callers that do not need to use more than one TokenStream at the same time from this analyzer should use this method for better performance.
virtual int GetPositionIncrementGap (String fieldName)
Invoked before indexing a Fieldable instance if terms have already been added to that field. This allows custom analyzers to place an automatic position increment gap between Fieldable instances using the same field name. The default position increment gap is 0. With a position increment gap of 0 and the typical default token position increment of 1, all terms in a field, including across Fieldable instances, are in successive positions, which allows, for instance, exact PhraseQuery matches across Fieldable instance boundaries.
virtual int GetOffsetGap (IFieldable field)
Just like GetPositionIncrementGap, except for Token offsets instead. By default this returns 1 for tokenized fields, as if the fields were joined with an extra space character, and 0 for un-tokenized fields. This method is only called if the field produced at least one token for indexing.
void Close ()
Frees persistent resources used by this Analyzer.
virtual void Dispose ()
Static Public Member Functions
static ISet< string > GetDefaultStopSet ()
Returns a set of default German stop words.
Additional Inherited Members
Protected Member Functions inherited from Lucene.Net.Analysis.Analyzer
virtual void Dispose (bool disposing)
Analyzer for the German language. Supports an external list of stop words (words that will not be indexed at all) and an external list of exclusions (words that will be indexed but not stemmed). A default set of stop words is used unless an alternative list is specified; the exclusion list is empty by default.
Definition at line 40 of file GermanAnalyzer.cs.
Lucene.Net.Analysis.De.GermanAnalyzer.GermanAnalyzer ()
Builds an analyzer with the default stop words (see GetDefaultStopSet).
Definition at line 97 of file GermanAnalyzer.cs.
Lucene.Net.Analysis.De.GermanAnalyzer.GermanAnalyzer (Version matchVersion)
Builds an analyzer with the default stop words (see GetDefaultStopSet).
matchVersion: Lucene compatibility version
Definition at line 107 of file GermanAnalyzer.cs.
Lucene.Net.Analysis.De.GermanAnalyzer.GermanAnalyzer (Version matchVersion, bool normalizeDin2)
Builds an analyzer with the default stop words (see GetDefaultStopSet).
matchVersion: Lucene compatibility version
normalizeDin2: Specifies whether the DIN-5007-2 style stemmer should be used in addition to the DIN-5007-1 (DIN1) stemmer. If true, words containing the expanded umlauts 'ae', 'ue', or 'oe' first have these converted to 'a', 'u', and 'o' respectively, before the DIN1 stemmer is invoked.
Definition at line 119 of file GermanAnalyzer.cs.
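The substitution described for normalizeDin2 can be illustrated with a small standalone sketch. This is not the library's stemmer: the class Din2Sketch and the method CollapseExpandedUmlauts are hypothetical names, and the input is assumed to be lowercased already.

```csharp
using System;

public static class Din2Sketch
{
    // Hypothetical helper, not part of Lucene.Net: replaces the expanded
    // umlauts "ae", "oe" and "ue" with their base vowels, mirroring the
    // substitution the normalizeDin2 parameter text describes.
    // Assumes the word has already been lowercased.
    public static string CollapseExpandedUmlauts(string word)
    {
        if (word == null) throw new ArgumentNullException("word");
        return word.Replace("ae", "a").Replace("oe", "o").Replace("ue", "u");
    }

    public static void Main()
    {
        Console.WriteLine(CollapseExpandedUmlauts("haeuser")); // prints "hauser"
        Console.WriteLine(CollapseExpandedUmlauts("moegen"));  // prints "mogen"
    }
}
```

A plain replacement like this also rewrites words in which the letter pair is not an expanded umlaut (for example "quelle" becomes "qulle"), so it should be read only as an illustration of the idea.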
Lucene.Net.Analysis.De.GermanAnalyzer.GermanAnalyzer (Version matchVersion, ISet< string > stopwords)
Builds an analyzer with the given stop words, using the default DIN-5007-1 stemmer.
matchVersion: Lucene compatibility version
stopwords: a stop word set
Definition at line 128 of file GermanAnalyzer.cs.
Lucene.Net.Analysis.De.GermanAnalyzer.GermanAnalyzer (Version matchVersion, ISet< string > stopwords, bool normalizeDin2)
Builds an analyzer with the given stop words.
matchVersion: Lucene compatibility version
stopwords: a stop word set
normalizeDin2: Specifies whether the DIN-5007-2 style stemmer should be used in addition to the DIN-5007-1 (DIN1) stemmer. If true, words containing the expanded umlauts 'ae', 'ue', or 'oe' first have these converted to 'a', 'u', and 'o' respectively, before the DIN1 stemmer is invoked.
Definition at line 141 of file GermanAnalyzer.cs.
Lucene.Net.Analysis.De.GermanAnalyzer.GermanAnalyzer (Version matchVersion, ISet< string > stopwords, ISet< string > stemExclusionSet)
Builds an analyzer with the given stop words, using the default DIN-5007-1 stemmer.
matchVersion: Lucene compatibility version
stopwords: a stop word set
stemExclusionSet: a stemming exclusion set
Definition at line 152 of file GermanAnalyzer.cs.
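A minimal sketch of calling this overload follows. It assumes the .NET 4 build of Lucene.Net 3.0.3, where ISet< string > is System.Collections.Generic.ISet< string >, and a reference to the contrib analyzers assembly. The class GermanAnalyzerSetup and the word lists are made up for illustration.

```csharp
using System.Collections.Generic;
using Lucene.Net.Analysis.De;
using Version = Lucene.Net.Util.Version;

public static class GermanAnalyzerSetup
{
    // Hypothetical factory method; the word lists are illustrative only.
    public static GermanAnalyzer Create()
    {
        // Words that should not be indexed at all.
        ISet<string> stopwords = new HashSet<string> { "der", "die", "das", "und" };

        // Words that should be indexed but never stemmed.
        ISet<string> stemExclusions = new HashSet<string> { "lucene" };

        return new GermanAnalyzer(Version.LUCENE_30, stopwords, stemExclusions);
    }
}
```

Passing a custom stop word set replaces the default set rather than extending it, as the class description states.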
Lucene.Net.Analysis.De.GermanAnalyzer.GermanAnalyzer (Version matchVersion, ISet< string > stopwords, ISet< string > stemExclusionSet, bool normalizeDin2)
Builds an analyzer with the given stop words.
matchVersion: Lucene compatibility version
stopwords: a stop word set
stemExclusionSet: a stemming exclusion set
normalizeDin2: Specifies whether the DIN-5007-2 style stemmer should be used in addition to the DIN-5007-1 (DIN1) stemmer. If true, words containing the expanded umlauts 'ae', 'ue', or 'oe' first have these converted to 'a', 'u', and 'o' respectively, before the DIN1 stemmer is invoked.
Definition at line 166 of file GermanAnalyzer.cs.
Lucene.Net.Analysis.De.GermanAnalyzer.GermanAnalyzer (Version matchVersion, params string[] stopwords)
Builds an analyzer with the given stop words.
Definition at line 180 of file GermanAnalyzer.cs.
Lucene.Net.Analysis.De.GermanAnalyzer.GermanAnalyzer (Version matchVersion, IDictionary< string, string > stopwords)
Builds an analyzer with the given stop words.
Definition at line 189 of file GermanAnalyzer.cs.
Lucene.Net.Analysis.De.GermanAnalyzer.GermanAnalyzer (Version matchVersion, FileInfo stopwords)
Builds an analyzer with the given stop words.
Definition at line 199 of file GermanAnalyzer.cs.
static ISet< string > Lucene.Net.Analysis.De.GermanAnalyzer.GetDefaultStopSet ()
Returns a set of default German stop words.
Definition at line 65 of file GermanAnalyzer.cs.
void Lucene.Net.Analysis.De.GermanAnalyzer.SetStemExclusionTable (String[] exclusionlist)
Builds an exclusion list from an array of Strings.
Definition at line 208 of file GermanAnalyzer.cs.
void Lucene.Net.Analysis.De.GermanAnalyzer.SetStemExclusionTable (IDictionary< string, string > exclusionlist)
Builds an exclusion list from an IDictionary.
Definition at line 218 of file GermanAnalyzer.cs.
void Lucene.Net.Analysis.De.GermanAnalyzer.SetStemExclusionTable (FileInfo exclusionlist)
Builds an exclusion list from the words contained in the given file.
Definition at line 228 of file GermanAnalyzer.cs.
override TokenStream Lucene.Net.Analysis.De.GermanAnalyzer.TokenStream (String fieldName, TextReader reader)
Creates a TokenStream which tokenizes all the text in the provided TextReader.
Definition at line 240 of file GermanAnalyzer.cs.
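As a rough sketch of how the returned TokenStream might be consumed, the example below collects the analyzed terms of a string. It assumes the Lucene.Net 3.0.3 core and contrib analyzers assemblies are referenced. The class GermanTokenDump and the field name "content" are hypothetical, and ITermAttribute is the term attribute from Lucene.Net.Analysis.Tokenattributes rather than anything documented on this page.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.De;
using Lucene.Net.Analysis.Tokenattributes;
using Version = Lucene.Net.Util.Version;

public static class GermanTokenDump
{
    // Hypothetical helper: runs the text through GermanAnalyzer and
    // returns the terms that would be indexed.
    public static List<string> Analyze(string text)
    {
        var terms = new List<string>();
        var analyzer = new GermanAnalyzer(Version.LUCENE_30);

        // "content" is an arbitrary field name chosen for this example.
        using (TokenStream stream = analyzer.TokenStream("content", new StringReader(text)))
        {
            ITermAttribute term = stream.AddAttribute<ITermAttribute>();
            while (stream.IncrementToken())
            {
                terms.Add(term.Term);
            }
        }

        analyzer.Close(); // frees persistent resources held by the analyzer
        return terms;
    }
}
```

The exact terms returned depend on the stop word set and the stemmer, so no expected output is shown here.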