Class | Description |
---|---|
DecompoundToken | A token that was generated from a compound. |
DictionaryToken | A token stored in a Dictionary. |
GraphvizFormatter | Outputs the dot (graphviz) string for the Viterbi lattice. |
KoreanAnalyzer | Analyzer for Korean that uses morphological analysis. |
KoreanNumberFilter | A TokenFilter that normalizes Korean numbers to regular Arabic decimal numbers in half-width characters. |
KoreanNumberFilter.NumberBuffer | Buffer that holds a Korean number string and a position index used as a parsed-to marker. |
KoreanNumberFilterFactory | Factory for KoreanNumberFilter. |
KoreanPartOfSpeechStopFilter | Removes tokens that match a set of part-of-speech tags. |
KoreanPartOfSpeechStopFilterFactory | Factory for KoreanPartOfSpeechStopFilter. |
KoreanReadingFormFilter | Replaces term text with the ReadingAttribute, which is the Hangul transcription of Hanja characters. |
KoreanReadingFormFilterFactory | Factory for KoreanReadingFormFilter. |
KoreanTokenizer | Tokenizer for Korean that uses morphological analysis. |
KoreanTokenizerFactory | Factory for KoreanTokenizer. |
POS | Part of speech classification for Korean based on Sejong corpus classification. |
Token | Analyzed token with morphological data. |

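To make the KoreanNumberFilter description concrete, the following is a minimal, self-contained sketch of the kind of normalization it describes: turning a Korean number string into Arabic decimal digits. It is not the Lucene implementation and does not use the Lucene API. The class `KoreanNumberSketch` and its methods are hypothetical names chosen for illustration, and the sketch assumes only non-negative integers built from Hangul digits, ASCII or full-width digits, and the units 십, 백, 천, 만, 억 and 조 (no decimals, separators or Hanja numerals).

```java
// Hypothetical illustration only; not part of Lucene.
final class KoreanNumberSketch {
    // Index in this string is the digit value: 영=0, 일=1, ... 구=9.
    private static final String HANGUL_DIGITS = "영일이삼사오육칠팔구";

    // Returns the multiplier for a unit character, or 0 if it is not a unit.
    static long unitValue(char c) {
        switch (c) {
            case '십': return 10L;
            case '백': return 100L;
            case '천': return 1_000L;
            case '만': return 10_000L;
            case '억': return 100_000_000L;
            case '조': return 1_000_000_000_000L;
            default: return 0L;
        }
    }

    static String normalize(String text) {
        long total = 0;        // sum of completed large-unit groups (만, 억, 조)
        long section = 0;      // value accumulated below the next large unit
        long digits = 0;       // run of plain digits not yet scaled by a unit
        boolean hasDigits = false;
        boolean sectionUsed = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            int d = HANGUL_DIGITS.indexOf(c);
            if (d < 0) {
                d = Character.digit(c, 10); // ASCII and full-width digits
            }
            if (d >= 0) {
                digits = digits * 10 + d;
                hasDigits = true;
                continue;
            }
            long unit = unitValue(c);
            if (unit == 0) {
                throw new IllegalArgumentException("Unsupported character: " + c);
            }
            if (unit < 10_000L) {
                // Small unit: a missing digit means an implicit 1 (십 = 10).
                section += (hasDigits ? digits : 1) * unit;
                sectionUsed = true;
            } else {
                // Large unit: scale everything gathered since the last large unit.
                long group = section + digits;
                total += ((sectionUsed || hasDigits) ? group : 1) * unit;
                section = 0;
                sectionUsed = false;
            }
            digits = 0;
            hasDigits = false;
        }
        return Long.toString(total + section + digits);
    }

    public static void main(String[] args) {
        System.out.println(normalize("십만이천오백")); // prints 102500
    }
}
```

In this sketch, small units (십, 백, 천) multiply the digit run that precedes them, while large units (만, 억, 조) close off and scale the whole group accumulated so far. A run of digits with no unit is read positionally.
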
Enum | Description |
---|---|
KoreanTokenizer.DecompoundMode | Decompound mode: this determines how the tokenizer handles POS.Type.COMPOUND, POS.Type.INFLECT and POS.Type.PREANALYSIS tokens. |
KoreanTokenizer.Type | Token type reflecting the original source of this token. |
POS.Tag | Part of speech tag for Korean based on Sejong corpus classification. |
POS.Type | The type of the token. |

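The effect of a decompound mode can be illustrated with a small, self-contained sketch. It does not call Lucene. The class `DecompoundSketch`, its `Mode` enum and the `emit` method are hypothetical stand-ins, and the three modes are assumed to behave as follows: `NONE` keeps the compound as is, `DISCARD` emits only the parts, and `MIXED` emits the compound followed by its parts. The split of the sample compound is supplied by hand rather than computed by a dictionary.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Lucene.
final class DecompoundSketch {
    // Assumed to mirror the three decompound behaviors described above.
    enum Mode { NONE, DISCARD, MIXED }

    // Returns the terms that would be emitted for one compound token.
    static List<String> emit(String compound, List<String> parts, Mode mode) {
        List<String> out = new ArrayList<>();
        if (mode != Mode.DISCARD) {
            out.add(compound);   // NONE and MIXED keep the original form
        }
        if (mode != Mode.NONE) {
            out.addAll(parts);   // DISCARD and MIXED add the decomposed parts
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> parts = List.of("가락지", "나무");
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " -> " + emit("가락지나무", parts, mode));
        }
    }
}
```

Keeping the original form alongside the parts lets a query match either the whole compound or one of its components, at the cost of a larger index.
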
Copyright © 2000-2024 Apache Software Foundation. All Rights Reserved.