Apache Lucene.Net 2.4.0 Class Library API

Token Methods

The methods of the Token class are listed below. For a complete list of Token class members, see the Token Members topic.

Public Instance Methods

Clear Resets the term text, payload, and positionIncrement to their defaults. Other fields such as startOffset, endOffset and the token type are not reset, since they are normally overwritten by the tokenizer.
Clone Overloaded.
EndOffset Returns this Token's ending offset, one greater than the position of the last character corresponding to this token in the source text. The length of the token in the source text is (endOffset - startOffset).
Equals 
GetFlags EXPERIMENTAL: While we think this is here to stay, we may want to change it to be a long. Gets the bitset of any bits that have been set. This is completely distinct from Type(), although they do share similar purposes. The flags can be used to encode information about the token for use by other TokenFilters.
GetHashCode 
GetPayload Returns this Token's payload.
GetPositionIncrement Returns the position increment of this Token.
GetType (inherited from Object) Gets the Type of the current instance.
Reinit Overloaded.
ResizeTermBuffer Grows the termBuffer to at least size newSize, preserving the existing content. Note: if the next operation is to change the contents of the term buffer, use SetTermBuffer(char[], int, int), SetTermBuffer(String), or SetTermBuffer(String, int, int) instead, to optimally combine the resize with the setting of the termBuffer.
SetEndOffset Sets the ending offset.
SetFlags 
SetPayload Sets this Token's payload.
SetPositionIncrement 
SetStartOffset Sets the starting offset.
SetTermBuffer Overloaded. Copies the contents of buffer, starting at offset for length characters, into the termBuffer array.
SetTermLength Sets the number of valid characters (length of the term) in the termBuffer array. Use this to truncate the termBuffer or to synchronize with external manipulation of the termBuffer. Note: to grow the size of the array, call ResizeTermBuffer(int) first.
SetTermText Sets the Token's term text. NOTE: for better indexing speed you should instead use the char[] termBuffer methods to set the term text.
SetType Sets the lexical type.
StartOffset Returns this Token's starting offset, the position of the first character corresponding to this token in the source text. Note that the difference between EndOffset() and StartOffset() may not be equal to the length of the term text, as the term text may have been altered by a stemmer or some other filter.
Term Returns the Token's term text. This method has a performance penalty because the text is stored internally in a char[]. If possible, use TermBuffer() and TermLength() directly instead. If you really need a string, use this method, which is nothing more than a convenience call to new String(token.TermBuffer(), 0, token.TermLength()).
TermBuffer Returns the internal termBuffer character array, which you can then directly alter. If the array is too small for your token, use ResizeTermBuffer(int) to increase it. After altering the buffer, be sure to call SetTermLength to record the number of valid characters that were placed into the termBuffer.
TermLength Returns the number of valid characters (length of the term) in the termBuffer array.
TermText Returns the Token's term text. This method has a performance penalty because the text is stored internally in a char[]. If possible, use TermBuffer() and TermLength() directly instead. If you really need a string, use Term().
ToString 
Type Returns this Token's lexical type. Defaults to "word".
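
The term buffer methods above are meant to be used together: grow the buffer with ResizeTermBuffer, edit the array returned by TermBuffer, then record the valid length with SetTermLength. The following is a minimal sketch of that workflow, not an official sample. It assumes the Lucene.Net 2.4.0 assembly is referenced; the class TokenBufferSketch and the helper AppendChar are hypothetical names introduced only for illustration. It also shows why EndOffset() - StartOffset() can differ from TermLength() once a filter has altered the term.

```csharp
using System;
using Lucene.Net.Analysis; // assumption: Lucene.Net 2.4.0 is referenced

public static class TokenBufferSketch
{
    // Hypothetical helper (not part of Lucene.Net): appends one character
    // to the token's term by editing the internal buffer directly.
    public static void AppendChar(Token token, char c)
    {
        int length = token.TermLength();
        token.ResizeTermBuffer(length + 1);   // grows if needed, keeps content
        char[] buffer = token.TermBuffer();   // fetch the (possibly new) array
        buffer[length] = c;
        token.SetTermLength(length + 1);      // record the valid character count
    }

    public static void Main()
    {
        char[] source = "running fast".ToCharArray();

        Token token = new Token();
        token.SetTermBuffer(source, 0, 7);    // copies "running" into termBuffer
        token.SetStartOffset(0);
        token.SetEndOffset(7);                // one past the last source character

        // Simulate a stemmer by truncating the term in place.
        token.SetTermLength(3);

        Console.WriteLine(token.Term());                             // run
        Console.WriteLine(token.EndOffset() - token.StartOffset());  // 7
        Console.WriteLine(token.TermLength());                       // 3

        AppendChar(token, 's');
        Console.WriteLine(token.Term());                             // runs
    }
}
```

Term() is called here only to print the result; in an analysis chain, reading TermBuffer() and TermLength() directly avoids allocating a string per token.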

Protected Instance Methods

Finalize (inherited from Object) Allows an Object to attempt to free resources and perform other cleanup operations before the Object is reclaimed by garbage collection.
MemberwiseClone (inherited from Object) Creates a shallow copy of the current Object.

See Also

Token Class | Lucene.Net.Analysis Namespace