Analyzer for the German language. Supports an external list of stopwords (words that will not be indexed at all) and an external list of exclusions (words that will be indexed but not stemmed). A default set of stopwords is used unless an alternative list is specified; the exclusion list is empty by default.

Inherits Lucene.Net.Analysis.Analyzer.

Public Member Functions
	GermanAnalyzer ()
	Builds an analyzer with the default stop words: GetDefaultStopSet

	GermanAnalyzer (Version matchVersion)
	Builds an analyzer with the default stop words: GetDefaultStopSet

	GermanAnalyzer (Version matchVersion, bool normalizeDin2)
	Builds an analyzer with the default stop words: GetDefaultStopSet

	GermanAnalyzer (Version matchVersion, ISet< string > stopwords)
	Builds an analyzer with the given stop words, using the default DIN-5007-1 stemmer

	GermanAnalyzer (Version matchVersion, ISet< string > stopwords, bool normalizeDin2)
	Builds an analyzer with the given stop words

	GermanAnalyzer (Version matchVersion, ISet< string > stopwords, ISet< string > stemExclusionSet)
	Builds an analyzer with the given stop words, using the default DIN-5007-1 stemmer

	GermanAnalyzer (Version matchVersion, ISet< string > stopwords, ISet< string > stemExclusionSet, bool normalizeDin2)
	Builds an analyzer with the given stop words

	GermanAnalyzer (Version matchVersion, params string[] stopwords)
	Builds an analyzer with the given stop words.

	GermanAnalyzer (Version matchVersion, IDictionary< string, string > stopwords)
	Builds an analyzer with the given stop words.

	GermanAnalyzer (Version matchVersion, FileInfo stopwords)
	Builds an analyzer with the given stop words.

void	SetStemExclusionTable (String[] exclusionlist)
	Builds an exclusion list from an array of strings.

void	SetStemExclusionTable (IDictionary< string, string > exclusionlist)
	Builds an exclusion list from an IDictionary.

void	SetStemExclusionTable (FileInfo exclusionlist)
	Builds an exclusion list from the words contained in the given file.

override TokenStream	TokenStream (String fieldName, TextReader reader)
	Creates a TokenStream which tokenizes all the text in the provided TextReader.

Public Member Functions inherited from Lucene.Net.Analysis.Analyzer
abstract TokenStream	TokenStream (String fieldName, System.IO.TextReader reader)
	Creates a TokenStream which tokenizes all the text in the provided Reader. Must be able to handle null field name for backward compatibility.

virtual TokenStream	ReusableTokenStream (String fieldName, System.IO.TextReader reader)
	Creates a TokenStream that is allowed to be re-used from the previous time that the same thread called this method. Callers that do not need to use more than one TokenStream at the same time from this analyzer should use this method for better performance.

virtual int	GetPositionIncrementGap (String fieldName)
	Invoked before indexing a Fieldable instance if terms have already been added to that field. This allows custom analyzers to place an automatic position increment gap between Fieldable instances using the same field name. The default position increment gap is 0. With a position increment gap of 0 and the typical default token position increment of 1, all terms in a field, including those across Fieldable instances, occupy successive positions. This allows, for instance, exact PhraseQuery matches across Fieldable instance boundaries.

virtual int	GetOffsetGap (IFieldable field)
	Just like GetPositionIncrementGap, except for Token offsets instead. By default this returns 1 for tokenized fields, as if the fields were joined with an extra space character, and 0 for un-tokenized fields. This method is only called if the field produced at least one token for indexing.

void	Close ()
	Frees persistent resources used by this Analyzer

virtual void	Dispose ()

Static Public Member Functions
static ISet< string >	GetDefaultStopSet ()
	Returns a set of default German stopwords

Additional Inherited Members
Protected Member Functions inherited from Lucene.Net.Analysis.Analyzer
virtual void	Dispose (bool disposing)

Detailed Description

Analyzer for the German language. Supports an external list of stopwords (words that will not be indexed at all) and an external list of exclusions (words that will be indexed but not stemmed). A default set of stopwords is used unless an alternative list is specified; the exclusion list is empty by default.

Definition at line 40 of file GermanAnalyzer.cs.

Constructor & Destructor Documentation

Lucene.Net.Analysis.De.GermanAnalyzer.GermanAnalyzer ( )

Builds an analyzer with the default stop words: GetDefaultStopSet

Definition at line 97 of file GermanAnalyzer.cs.

Lucene.Net.Analysis.De.GermanAnalyzer.GermanAnalyzer ( Version matchVersion )

Builds an analyzer with the default stop words: GetDefaultStopSet

Parameters

matchVersion Lucene compatibility version

Definition at line 107 of file GermanAnalyzer.cs.

Lucene.Net.Analysis.De.GermanAnalyzer.GermanAnalyzer	(	Version	matchVersion,
		bool	normalizeDin2
	)

Builds an analyzer with the default stop words: GetDefaultStopSet

Parameters

matchVersion	Lucene compatibility version
normalizeDin2	Specifies whether the DIN-5007-2 style stemmer should be used in addition to DIN-5007-1. If set, words containing 'ae', 'ue', or 'oe' (expanded umlauts) are first converted to 'a', 'u', and 'o' respectively, before the DIN-5007-1 stemmer is invoked.

Definition at line 119 of file GermanAnalyzer.cs.

Lucene.Net.Analysis.De.GermanAnalyzer.GermanAnalyzer	(	Version	matchVersion,
		ISet< string >	stopwords
	)

Builds an analyzer with the given stop words, using the default DIN-5007-1 stemmer

Parameters

matchVersion	Lucene compatibility version
stopwords	a stopword set

Definition at line 128 of file GermanAnalyzer.cs.

Lucene.Net.Analysis.De.GermanAnalyzer.GermanAnalyzer	(	Version	matchVersion,
		ISet< string >	stopwords,
		bool	normalizeDin2
	)

Builds an analyzer with the given stop words

Parameters

matchVersion	Lucene compatibility version
stopwords	a stopword set
normalizeDin2	Specifies whether the DIN-5007-2 style stemmer should be used in addition to DIN-5007-1. If set, words containing 'ae', 'ue', or 'oe' (expanded umlauts) are first converted to 'a', 'u', and 'o' respectively, before the DIN-5007-1 stemmer is invoked.

Definition at line 141 of file GermanAnalyzer.cs.

Lucene.Net.Analysis.De.GermanAnalyzer.GermanAnalyzer	(	Version	matchVersion,
		ISet< string >	stopwords,
		ISet< string >	stemExclusionSet
	)

Builds an analyzer with the given stop words, using the default DIN-5007-1 stemmer

Parameters

matchVersion	Lucene compatibility version
stopwords	a stopword set
stemExclusionSet	a stemming exclusion set

Definition at line 152 of file GermanAnalyzer.cs.

Lucene.Net.Analysis.De.GermanAnalyzer.GermanAnalyzer	(	Version	matchVersion,
		ISet< string >	stopwords,
		ISet< string >	stemExclusionSet,
		bool	normalizeDin2
	)

Builds an analyzer with the given stop words

Parameters

matchVersion	Lucene compatibility version
stopwords	a stopword set
stemExclusionSet	a stemming exclusion set
normalizeDin2	Specifies whether the DIN-5007-2 style stemmer should be used in addition to DIN-5007-1. If set, words containing 'ae', 'ue', or 'oe' (expanded umlauts) are first converted to 'a', 'u', and 'o' respectively, before the DIN-5007-1 stemmer is invoked.

Definition at line 166 of file GermanAnalyzer.cs.
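The substitution described for normalizeDin2 can be illustrated with a minimal standalone sketch. This is not the library's implementation: the class Din2Sketch and the helper CollapseExpandedUmlauts are hypothetical names introduced here, and the code only mirrors the 'ae', 'ue', 'oe' to 'a', 'u', 'o' rule stated in the parameter description.

```csharp
using System;

// Hypothetical illustration only; not part of Lucene.Net.
public static class Din2Sketch
{
    // Collapses expanded umlauts as described for normalizeDin2:
    // 'ae' -> 'a', 'ue' -> 'u', 'oe' -> 'o'.
    public static string CollapseExpandedUmlauts(string word)
    {
        return word.Replace("ae", "a")
                   .Replace("ue", "u")
                   .Replace("oe", "o");
    }

    public static void Main()
    {
        Console.WriteLine(CollapseExpandedUmlauts("mueller")); // prints "muller"
        Console.WriteLine(CollapseExpandedUmlauts("goethe"));  // prints "gothe"
    }
}
```

Note that a plain substitution like this also rewrites letter pairs that are not expanded umlauts (for example "neue" becomes "neu"), which is worth keeping in mind when deciding whether to enable the option.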

Lucene.Net.Analysis.De.GermanAnalyzer.GermanAnalyzer	(	Version	matchVersion,
		params string[]	stopwords
	)

Builds an analyzer with the given stop words.

Parameters

stopwords

Definition at line 180 of file GermanAnalyzer.cs.

Lucene.Net.Analysis.De.GermanAnalyzer.GermanAnalyzer	(	Version	matchVersion,
		IDictionary< string, string >	stopwords
	)

Builds an analyzer with the given stop words.

Definition at line 189 of file GermanAnalyzer.cs.

Lucene.Net.Analysis.De.GermanAnalyzer.GermanAnalyzer	(	Version	matchVersion,
		FileInfo	stopwords
	)

Builds an analyzer with the given stop words.

Definition at line 199 of file GermanAnalyzer.cs.

Member Function Documentation

static ISet<string> Lucene.Net.Analysis.De.GermanAnalyzer.GetDefaultStopSet ( )

Returns a set of default German stopwords

Definition at line 65 of file GermanAnalyzer.cs.
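One possible use of GetDefaultStopSet is to extend the defaults with application-specific stopwords and pass the result to the ISet< string > constructor. The sketch below assumes the returned set can be enumerated into a HashSet and that it may not be modifiable in place, hence the copy. StopSetSketch, ExtendedStopSet and the word "firma" are hypothetical names chosen for illustration.

```csharp
using System.Collections.Generic;
using Lucene.Net.Analysis.De;

// Hypothetical helper class; not part of Lucene.Net.
public static class StopSetSketch
{
    // Returns a copy of the default German stopwords plus the given extras.
    // The copy is made on the assumption that the default set is read-only.
    public static ISet<string> ExtendedStopSet(params string[] extraStopwords)
    {
        ISet<string> stopwords = new HashSet<string>(GermanAnalyzer.GetDefaultStopSet());
        stopwords.UnionWith(extraStopwords);
        return stopwords;
    }

    // Builds an analyzer that uses the extended stopword set.
    public static GermanAnalyzer BuildAnalyzer(params string[] extraStopwords)
    {
        return new GermanAnalyzer(Lucene.Net.Util.Version.LUCENE_30,
                                  ExtendedStopSet(extraStopwords));
    }
}
```

A call such as StopSetSketch.BuildAnalyzer("firma") would then drop "firma" in addition to the default stopwords.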

void Lucene.Net.Analysis.De.GermanAnalyzer.SetStemExclusionTable ( String[] exclusionlist )

Builds an exclusion list from an array of strings.

Definition at line 208 of file GermanAnalyzer.cs.

void Lucene.Net.Analysis.De.GermanAnalyzer.SetStemExclusionTable ( IDictionary< string, string > exclusionlist )

Builds an exclusion list from an IDictionary.

Definition at line 218 of file GermanAnalyzer.cs.

void Lucene.Net.Analysis.De.GermanAnalyzer.SetStemExclusionTable ( FileInfo exclusionlist )

Builds an exclusion list from the words contained in the given file.

Definition at line 228 of file GermanAnalyzer.cs.

override TokenStream Lucene.Net.Analysis.De.GermanAnalyzer.TokenStream	(	String	fieldName,
		TextReader	reader
	)

Creates a TokenStream which tokenizes all the text in the provided TextReader.

Parameters

fieldName
reader

Returns: A TokenStream built from a StandardTokenizer filtered with StandardFilter, StopFilter, GermanStemFilter

Definition at line 240 of file GermanAnalyzer.cs.
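As a rough usage sketch, the returned TokenStream can be consumed term by term. The example assumes the Lucene.Net 3.0-era attribute API (ITermAttribute, AddAttribute, IncrementToken) and the LUCENE_30 version constant. The class GermanAnalyzerSketch, the field name "body", and the word lists are hypothetical. The exclusion entry is given in lower case on the assumption that terms are lower-cased before stemming.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.De;
using Lucene.Net.Analysis.Tokenattributes;

// Hypothetical helper class; not part of Lucene.Net.
public static class GermanAnalyzerSketch
{
    // Analyzes the text and returns the terms that would be indexed.
    public static List<string> Analyze(string text)
    {
        // Custom stopwords: these terms are dropped entirely.
        ISet<string> stopwords = new HashSet<string> { "und", "der", "die", "das" };
        // Stem exclusions: these terms are indexed but not stemmed.
        ISet<string> stemExclusions = new HashSet<string> { "häuser" };

        var terms = new List<string>();
        using (var analyzer = new GermanAnalyzer(
                   Lucene.Net.Util.Version.LUCENE_30, stopwords, stemExclusions))
        {
            TokenStream stream = analyzer.TokenStream("body", new StringReader(text));
            ITermAttribute term = stream.AddAttribute<ITermAttribute>();
            while (stream.IncrementToken())
            {
                terms.Add(term.Term);
            }
            stream.Dispose();
        }
        return terms;
    }
}
```

Calling GermanAnalyzerSketch.Analyze("Hund und Katze") should yield a list without "und", since it is in the stopword set. The exact stemmed forms of the remaining terms depend on the stemmer and are not shown here.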

The documentation for this class was generated from the following file:

contrib/Analyzers/De/GermanAnalyzer.cs
