Token Methods

The methods of the Token class are listed below. For a complete list of Token class members, see the Token Members topic.

Public Instance Methods

Clear	Resets the term text, payload, and positionIncrement to default. Other fields such as startOffset, endOffset and the token type are not reset since they are normally overwritten by the tokenizer.
Clone	Overloaded.
EndOffset	Returns this Token's ending offset, one greater than the position of the last character corresponding to this token in the source text. The length of the token in the source text is (endOffset - startOffset).
Equals
GetFlags	EXPERIMENTAL: While we think this is here to stay, we may want to change it to be a long. Gets the bitset of any flags that have been set. This is completely distinct from Type(), although the two serve similar purposes. The flags can be used to encode information about the token for use by other TokenFilters.
GetHashCode
GetPayload	Returns this Token's payload.
GetPositionIncrement	Returns the position increment of this Token.
GetType (inherited from Object)	Gets the Type of the current instance.
Reinit	Overloaded.
ResizeTermBuffer	Grows the termBuffer to at least size newSize, preserving the existing content. Note: If the next operation is to change the contents of the term buffer, use SetTermBuffer(char[], int, int), SetTermBuffer(String), or SetTermBuffer(String, int, int) to optimally combine the resize with the setting of the termBuffer.
SetEndOffset	Sets the ending offset.
SetFlags
SetPayload	Sets this Token's payload.
SetPositionIncrement
SetStartOffset	Sets the starting offset.
SetTermBuffer	Overloaded. Copies the contents of buffer, starting at offset for length characters, into the termBuffer array.
SetTermLength	Sets the number of valid characters (length of the term) in the termBuffer array. Use this to truncate the termBuffer or to synchronize with external manipulation of the termBuffer. Note: to grow the size of the array, call ResizeTermBuffer(int) first.
SetTermText	Sets the Token's term text. NOTE: for better indexing speed you should instead use the char[] termBuffer methods to set the term text.
SetType	Sets the lexical type.
StartOffset	Returns this Token's starting offset, the position of the first character corresponding to this token in the source text. Note that the difference between endOffset() and startOffset() may not be equal to termText.length(), as the term text may have been altered by a stemmer or some other filter.
Term	Returns the Token's term text. This method has a performance penalty because the text is stored internally in a char[]. If possible, use TermBuffer() and TermLength() directly instead. If you really need a string, use this method, which is nothing more than a convenience call to new String(token.TermBuffer(), 0, token.TermLength()).
TermBuffer	Returns the internal termBuffer character array, which you can then alter directly. If the array is too small for your token, use ResizeTermBuffer(int) to increase it. After altering the buffer, be sure to call SetTermLength to record the number of valid characters that were placed into the termBuffer.
TermLength	Returns the number of valid characters (length of the term) in the termBuffer array.
TermText	Returns the Token's term text. This method has a performance penalty because the text is stored internally in a char[]. If possible, use TermBuffer() and TermLength() directly instead. If you really need a string, use Term().
ToString
Type	Returns this Token's lexical type. Defaults to "word".
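
The term buffer methods above (SetTermBuffer, ResizeTermBuffer, TermBuffer, SetTermLength, TermLength and Term) are meant to be used together. The following is a minimal Java sketch of that workflow. It does not use the library: TermBufferSketch is a hypothetical stand-in that models only the behavior described in the table, and its initial buffer size and growth strategy are assumptions made for illustration.

```java
import java.util.Arrays;

/** Hypothetical stand-in for the term-buffer part of Token; not the library class. */
public class TermBufferSketch {
    // Assumption: a small initial capacity, chosen only for this example.
    private char[] termBuffer = new char[4];
    private int termLength = 0;

    /** Grows the buffer to at least newSize, preserving the existing content. */
    public char[] resizeTermBuffer(int newSize) {
        if (termBuffer.length < newSize) {
            termBuffer = Arrays.copyOf(termBuffer, newSize);
        }
        return termBuffer;
    }

    /** Returns the internal array; callers may alter it directly. */
    public char[] termBuffer() {
        return termBuffer;
    }

    /** Returns the number of valid characters in the buffer. */
    public int termLength() {
        return termLength;
    }

    /** Records how many characters in the buffer are valid. */
    public void setTermLength(int length) {
        if (length < 0 || length > termBuffer.length) {
            throw new IllegalArgumentException("length " + length + " is outside the buffer");
        }
        termLength = length;
    }

    /** Copies length characters of buffer, starting at offset, and resizes if needed. */
    public void setTermBuffer(char[] buffer, int offset, int length) {
        resizeTermBuffer(length);
        System.arraycopy(buffer, offset, termBuffer, 0, length);
        termLength = length;
    }

    /** Convenience accessor: allocates a new String from the valid characters. */
    public String term() {
        return new String(termBuffer, 0, termLength);
    }

    public static void main(String[] args) {
        TermBufferSketch token = new TermBufferSketch();
        char[] source = "running".toCharArray();
        token.setTermBuffer(source, 0, source.length);
        System.out.println(token.term()); // running

        // Truncate in place, as a simple stemmer might.
        token.setTermLength(3);
        System.out.println(token.term()); // run

        // Make sure there is room, write into the buffer, then record the new length.
        char[] buf = token.resizeTermBuffer(token.termLength() + 1);
        buf[token.termLength()] = 's';
        token.setTermLength(token.termLength() + 1);
        System.out.println(token.term()); // runs
    }
}
```

The sketch only calls term() to print results. In a filter that processes many tokens, reading termBuffer() and termLength() directly avoids allocating a string per token, which is the performance concern the Term and TermText descriptions mention.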

Protected Instance Methods

Finalize (inherited from Object)	Allows an Object to attempt to free resources and perform other cleanup operations before the Object is reclaimed by garbage collection.
MemberwiseClone (inherited from Object)	Creates a shallow copy of the current Object.
