Package org.apache.tika.langdetect.tika
Class LanguageProfile
java.lang.Object
org.apache.tika.langdetect.tika.LanguageProfile
Language profile based on ngram counts.
Since: Apache Tika 0.5
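To make "ngram counts" concrete, the following is a minimal sketch of how overlapping character ngrams of a fixed length can be counted from a piece of text. The class `NgramCountSketch` and its method `countNgrams` are hypothetical names used only for illustration and are not part of Apache Tika. The sketch also assumes no text normalization, which the real profile builder may perform.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; not part of Apache Tika.
final class NgramCountSketch {

    // Counts every overlapping character ngram of the given length in the text.
    // Assumption: the text is used as-is, with no lowercasing or filtering.
    static Map<String, Long> countNgrams(String text, int length) {
        Map<String, Long> counts = new HashMap<>();
        for (int i = 0; i + length <= text.length(); i++) {
            // merge() inserts 1 for a new ngram or adds 1 to its existing count
            counts.merge(text.substring(i, i + length), 1L, Long::sum);
        }
        return counts;
    }
}
```

For example, `NgramCountSketch.countNgrams("banana", 3)` counts "ana" twice and "ban" and "nan" once each.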
Field Summary
Modifier and Type   Field                  Description
static final int    DEFAULT_NGRAM_LENGTH
static boolean      useInterleaved
Constructor Summary
Constructor                                   Description
LanguageProfile()
LanguageProfile(int length)
LanguageProfile(String content)
LanguageProfile(String content, int length)
Method Summary
Modifier and Type   Method                           Description
void                add(String ngram)                Adds a single occurrence of the given ngram to this profile.
void                add(String ngram, long count)    Adds multiple occurrences of the given ngram to this profile.
double              distance(LanguageProfile that)   Calculates the geometric distance between this and the given other language profile.
long                getCount()
long                getCount(String ngram)
String              toString()
Field Details
DEFAULT_NGRAM_LENGTH
public static final int DEFAULT_NGRAM_LENGTH
useInterleaved
public static boolean useInterleaved
Constructor Details
LanguageProfile
public LanguageProfile(int length)
LanguageProfile
public LanguageProfile()
LanguageProfile
public LanguageProfile(String content, int length)
LanguageProfile
public LanguageProfile(String content)
Method Details
getCount
public long getCount()
getCount
public long getCount(String ngram)
add
public void add(String ngram)
Adds a single occurrence of the given ngram to this profile.
Parameters:
ngram - the ngram
add
public void add(String ngram, long count)
Adds multiple occurrences of the given ngram to this profile.
Parameters:
ngram - the ngram
count - number of occurrences to add
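The counting behaviour described for the two add methods can be sketched as follows. `ProfileCountsSketch` is a hypothetical stand-in, not the Tika class. It assumes that the single-occurrence add is equivalent to adding a count of 1, that the total reported by the no-argument getCount is the sum of all added occurrences, and that a per-ngram lookup returns 0 for an ngram never added.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; not part of Apache Tika.
final class ProfileCountsSketch {
    private final Map<String, Long> ngrams = new HashMap<>();
    private long total = 0;

    // Assumption: a single occurrence is the same as adding a count of 1.
    void add(String ngram) {
        add(ngram, 1);
    }

    // Adds the given number of occurrences to both the per-ngram count and the total.
    void add(String ngram, long count) {
        ngrams.merge(ngram, count, Long::sum);
        total += count;
    }

    // Total number of occurrences added so far (assumed meaning of the total count).
    long getCount() {
        return total;
    }

    // Occurrences of one ngram; 0 if it was never added (assumed behaviour).
    long getCount(String ngram) {
        return ngrams.getOrDefault(ngram, 0L);
    }
}
```

Under these assumptions, calling `add("the")`, then `add("the", 4)`, then `add("and", 2)` leaves a count of 5 for "the" and a total of 7.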
distance
public double distance(LanguageProfile that)
Calculates the geometric distance between this and the given other language profile.
Parameters:
that - the other language profile
Returns:
distance between the profiles
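One plausible reading of "geometric distance" is the Euclidean distance between the two profiles' relative ngram frequencies. The sketch below illustrates that reading only; it is an assumption, not a statement of how Tika computes the value. `ProfileDistanceSketch` and its methods are hypothetical, and profiles are represented here as plain maps from ngram to count.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration class; not part of Apache Tika.
final class ProfileDistanceSketch {

    // Euclidean distance between the relative ngram frequencies of two profiles.
    // Assumption: each count is divided by its profile's total before comparing.
    static double distance(Map<String, Long> a, Map<String, Long> b) {
        // Guard against division by zero for empty profiles.
        double totalA = Math.max(sum(a), 1.0);
        double totalB = Math.max(sum(b), 1.0);

        // Compare over every ngram seen in either profile.
        Set<String> ngrams = new HashSet<>(a.keySet());
        ngrams.addAll(b.keySet());

        double sumOfSquares = 0.0;
        for (String ngram : ngrams) {
            double difference = a.getOrDefault(ngram, 0L) / totalA
                    - b.getOrDefault(ngram, 0L) / totalB;
            sumOfSquares += difference * difference;
        }
        return Math.sqrt(sumOfSquares);
    }

    // Sums all counts in a profile.
    private static long sum(Map<String, Long> counts) {
        long total = 0;
        for (long value : counts.values()) {
            total += value;
        }
        return total;
    }
}
```

With this definition, two profiles with the same relative frequencies are at distance 0 regardless of how much text each was built from, and two profiles with no ngrams in common are at most the square root of 2 apart.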
toString
public String toString()