public final class OpenNLPTokenizer extends SegmentingTokenizerBase
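Runs the OpenNLP SentenceDetector and Tokenizer. The last token in each sentence is marked by setting the EOS_FLAG_BIT in the FlagsAttribute; following filters can use this information to apply operations to tokens one sentence at a time.
Nested classes/interfaces inherited from class AttributeSource: AttributeSource.State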
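As a rough illustration of how a following filter might use the end-of-sentence flag, the sketch below groups tokens into sentences by testing a flag bit on each token. It deliberately avoids the Lucene classes so that it runs on its own: the class `EosFlagSketch`, the method `groupBySentence`, the parallel `terms`/`flags` arrays, and the bit value `1` are all assumptions made for this example, not part of the API documented here. A real filter would read the flags from the FlagsAttribute and compare against OpenNLPTokenizer.EOS_FLAG_BIT.

```java
import java.util.ArrayList;
import java.util.List;

public class EosFlagSketch {
    // Stand-in for OpenNLPTokenizer.EOS_FLAG_BIT; the value 1 is assumed for this sketch.
    static final int EOS_FLAG_BIT = 1;

    // Groups terms into sentences, closing a sentence after each token
    // whose flags have the end-of-sentence bit set.
    static List<List<String>> groupBySentence(String[] terms, int[] flags) {
        List<List<String>> sentences = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (int i = 0; i < terms.length; i++) {
            current.add(terms[i]);
            if ((flags[i] & EOS_FLAG_BIT) != 0) {
                sentences.add(current);
                current = new ArrayList<>();
            }
        }
        // Keep trailing tokens that were never closed by a flagged token.
        if (!current.isEmpty()) {
            sentences.add(current);
        }
        return sentences;
    }

    public static void main(String[] args) {
        String[] terms = {"Hello", "world", ".", "Bye", "."};
        int[] flags = {0, 0, EOS_FLAG_BIT, 0, EOS_FLAG_BIT};
        // prints [[Hello, world, .], [Bye, .]]
        System.out.println(groupBySentence(terms, flags));
    }
}
```

Testing the bit with a mask, rather than comparing the whole flags value, leaves room for other filters to set unrelated bits on the same token.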
Modifier and Type | Field and Description |
---|---|
static int | EOS_FLAG_BIT |
Fields inherited from class SegmentingTokenizerBase: buffer, BUFFERMAX, offset
Fields inherited from class TokenStream: DEFAULT_TOKEN_ATTRIBUTE_FACTORY
Constructor and Description |
---|
OpenNLPTokenizer(AttributeFactory factory, NLPSentenceDetectorOp sentenceOp, NLPTokenizerOp tokenizerOp) |
Modifier and Type | Method and Description |
---|---|
void | close() |
protected boolean | incrementWord() |
void | reset() |
protected void | setNextSentence(int sentenceStart, int sentenceEnd) |
Methods inherited from class SegmentingTokenizerBase: end, incrementToken, isSafeEnd
Methods inherited from class Tokenizer: correctOffset, setReader
Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString
public OpenNLPTokenizer(AttributeFactory factory, NLPSentenceDetectorOp sentenceOp, NLPTokenizerOp tokenizerOp) throws IOException
Throws: IOException
public void close() throws IOException
Specified by: close in interface Closeable
Specified by: close in interface AutoCloseable
Overrides: close in class Tokenizer
Throws: IOException
protected void setNextSentence(int sentenceStart, int sentenceEnd)
Specified by: setNextSentence in class SegmentingTokenizerBase
protected boolean incrementWord()
Specified by: incrementWord in class SegmentingTokenizerBase
public void reset() throws IOException
Overrides: reset in class SegmentingTokenizerBase
Throws: IOException
Copyright © 2000-2019 Apache Software Foundation. All Rights Reserved.