java.lang.Object
  org.apache.nutch.analysis.lang.LanguageIdentifier
public class LanguageIdentifier
Identify the language of a piece of content, based on statistical analysis.
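This page does not describe the statistical analysis itself. As a rough illustration of one common approach, the sketch below compares character trigram counts of the input against small reference profiles and returns the label with the largest overlap. The class name `NGramSketch`, its methods, and the reference sentences are all hypothetical and are not part of the Nutch API; the real class may work quite differently.

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration only: NOT the Nutch implementation.
public class NGramSketch {

    // Count character trigrams in the lower-cased text.
    static Map<String, Integer> profile(String text) {
        Map<String, Integer> counts = new HashMap<>();
        String t = text.toLowerCase();
        for (int i = 0; i + 3 <= t.length(); i++) {
            counts.merge(t.substring(i, i + 3), 1, Integer::sum);
        }
        return counts;
    }

    // Similarity score: sum over shared trigrams of the smaller count.
    static int overlap(Map<String, Integer> a, Map<String, Integer> b) {
        int score = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            score += Math.min(e.getValue(), b.getOrDefault(e.getKey(), 0));
        }
        return score;
    }

    // Return the label of the reference profile that overlaps most with the content.
    static String identify(String content, Map<String, Map<String, Integer>> references) {
        Map<String, Integer> p = profile(content);
        String best = null;
        int bestScore = -1;
        for (Map.Entry<String, Map<String, Integer>> ref : references.entrySet()) {
            int s = overlap(p, ref.getValue());
            if (s > bestScore) {
                bestScore = s;
                best = ref.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical, tiny reference texts; a real profile would be built from a large corpus.
        Map<String, Map<String, Integer>> refs = new HashMap<>();
        refs.put("en", profile("the quick brown fox jumps over the lazy dog and the cat"));
        refs.put("fr", profile("le renard brun rapide saute par dessus le chien paresseux et le chat"));
        System.out.println(identify("the dog and the fox", refs));
        System.out.println(identify("le chien et le chat", refs));
    }
}
```

Profiles this small are only good enough to separate the two sample sentences; the point is the shape of the computation (build a profile, score it against references, pick the best), not its accuracy.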
Constructor Summary | |
---|---|
LanguageIdentifier(Configuration conf) | Constructs a new Language Identifier. |
Method Summary | |
---|---|
String | identify(InputStream is): Identify language from input stream. |
String | identify(InputStream is, String charset): Identify language from input stream. |
String | identify(String content): Identify the language of the given content. |
String | identify(StringBuilder content): Identify the language of the given content. |
static void | main(String[] args): Main method used for command line process. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public LanguageIdentifier(Configuration conf)
Method Detail |
---|
public static void main(String[] args)
Main method used for command line process.
Usage: LanguageIdentifier [-identifyrows filename maxlines] [-identifyfile charset filename] [-identifyfileset charset files] [-identifytext text] [-identifyurl url]
Parameters:
args - arguments.
public String identify(String content)
Identify the language of the given content.
Parameters:
content - the content to analyze.
public String identify(StringBuilder content)
Identify the language of the given content.
Parameters:
content - the content to analyze.
public String identify(InputStream is) throws IOException
Identify language from input stream. To read the stream with a specific charset, use the identify(InputStream, String) method.
Parameters:
is - the input stream to analyze.
Throws:
IOException - if something goes wrong on the input stream.
public String identify(InputStream is, String charset) throws IOException
Identify language from input stream.
Parameters:
is - the input stream to analyze.
charset - the charset to use to read the input stream.
Throws:
IOException - if something goes wrong on the input stream.
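The four identify overloads differ only in how the text reaches the analysis. The sketch below shows one plausible way such overloads could funnel into a single String-based method, and why the charset argument matters when the input is a stream. Everything here is an assumption made for illustration: `StreamOverloadSketch` is a hypothetical stand-in, its "analysis" is a placeholder that only looks for the letter é, and the use of the platform default charset in the one-argument stream overload is a guess, not documented behavior of LanguageIdentifier.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in mirroring the documented method shapes; NOT the Nutch class.
public class StreamOverloadSketch {

    // Placeholder for the statistical analysis: reports "fr" if the text contains 'é'.
    static String identify(String content) {
        return content.indexOf('\u00e9') >= 0 ? "fr" : "unknown";
    }

    // StringBuilder input is converted to a String and handed to the core method.
    static String identify(StringBuilder content) {
        return identify(content.toString());
    }

    // Assumption: without an explicit charset, fall back to the platform default.
    static String identify(InputStream is) throws IOException {
        return identify(is, Charset.defaultCharset().name());
    }

    // Read the whole stream, decode it with the given charset, then analyze the text.
    static String identify(InputStream is, String charset) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = is.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return identify(new String(out.toByteArray(), Charset.forName(charset)));
    }

    public static void main(String[] args) throws IOException {
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        // Decoding with the charset the bytes were written in preserves the 'é'.
        System.out.println(identify(new ByteArrayInputStream(utf8), "UTF-8"));
        // Decoding the same bytes as ISO-8859-1 turns 'é' into two other characters.
        System.out.println(identify(new ByteArrayInputStream(utf8), "ISO-8859-1"));
    }
}
```

Against the real class, the corresponding call would presumably look like `new LanguageIdentifier(conf).identify(is, "UTF-8")`, with `conf` being a Configuration instance as required by the constructor. Passing the charset the stream was actually written in is the safer choice whenever it is known, since a wrong decoding changes the characters the analysis sees.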