org.apache.lucene.analysis.cn.smart.hhmm
Class SegTokenFilter
java.lang.Object
org.apache.lucene.analysis.cn.smart.hhmm.SegTokenFilter
public class SegTokenFilter
- extends Object
Filters a SegToken by converting full-width Latin characters to half-width, then lowercasing Latin characters.
Additionally, all punctuation is converted into Utility.COMMON_DELIMITER.
WARNING: The status of the analyzers/smartcn analysis.cn.smart package is experimental.
The APIs and file formats introduced here might change in the future and will not be
supported anymore in such a case.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
SegTokenFilter
public SegTokenFilter()
filter
public SegToken filter(SegToken token)
- Filters an input SegToken.
Full-width Latin characters are converted to half-width, then all Latin characters are lowercased.
All punctuation is converted into Utility.COMMON_DELIMITER.
- Parameters:
token - input SegToken
- Returns:
normalized SegToken
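The normalization described above can be illustrated with a minimal, standalone sketch. It is not the Lucene implementation and does not use SegToken. It works on plain char arrays and assumes that full-width Latin means the Unicode full-width ASCII variants U+FF01 to U+FF5E, which sit 0xFEE0 above their half-width counterparts. The class name SegTokenNormalizationSketch, both method names, and the value of the COMMON_DELIMITER stand-in are hypothetical; the real Utility.COMMON_DELIMITER and the character ranges used by SegTokenFilter may differ.

```java
import java.util.Arrays;

/** Standalone sketch of the described normalization; not the Lucene SegTokenFilter code. */
class SegTokenNormalizationSketch {

    // Hypothetical stand-in for Utility.COMMON_DELIMITER; the real value may differ.
    static final char[] COMMON_DELIMITER = {','};

    // Distance between the full-width ASCII variants (U+FF01..U+FF5E)
    // and their half-width counterparts (U+0021..U+007E).
    private static final int FULLWIDTH_OFFSET = 0xFEE0;

    /** Converts full-width ASCII variants to half-width, then lowercases A-Z. */
    static char[] normalizeLatin(char[] input) {
        char[] out = Arrays.copyOf(input, input.length);
        for (int i = 0; i < out.length; i++) {
            char c = out[i];
            if (c >= 0xFF01 && c <= 0xFF5E) {
                c = (char) (c - FULLWIDTH_OFFSET); // full-width to half-width
            }
            if (c >= 'A' && c <= 'Z') {
                c = (char) (c + ('a' - 'A'));      // lowercase Latin
            }
            out[i] = c;
        }
        return out;
    }

    /** Returns the replacement text for a token already classified as punctuation. */
    static char[] normalizeDelimiter() {
        return COMMON_DELIMITER.clone();
    }

    public static void main(String[] args) {
        // Full-width "ABC123" written with Unicode escapes.
        char[] fullWidth = "\uFF21\uFF22\uFF23\uFF11\uFF12\uFF13".toCharArray();
        System.out.println(new String(normalizeLatin(fullWidth)));              // prints abc123
        System.out.println(new String(normalizeLatin("Lucene".toCharArray()))); // prints lucene
    }
}
```

The sketch copies its input rather than modifying it in place, which keeps the example easy to test. Whether filter(SegToken) mutates the token it receives or returns a new one is not stated here, so callers should not rely on either behavior without checking.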
Copyright © 2000-2010 Apache Software Foundation. All Rights Reserved.