Class LanguageProfile

java.lang.Object
org.apache.tika.langdetect.tika.LanguageProfile

public class LanguageProfile extends Object
Language profile based on ngram counts.
Since:
Apache Tika 0.5
  • Field Details

    • DEFAULT_NGRAM_LENGTH

      public static final int DEFAULT_NGRAM_LENGTH
    • useInterleaved

      public static boolean useInterleaved
  • Constructor Details

    • LanguageProfile

      public LanguageProfile(int length)
    • LanguageProfile

      public LanguageProfile()
    • LanguageProfile

      public LanguageProfile(String content, int length)
    • LanguageProfile

      public LanguageProfile(String content)
  • Method Details

    • getCount

      public long getCount()
    • getCount

      public long getCount(String ngram)
    • add

      public void add(String ngram)
      Adds a single occurrence of the given ngram to this profile.
      Parameters:
      ngram - the ngram
    • add

      public void add(String ngram, long count)
      Adds multiple occurrences of the given ngram to this profile.
      Parameters:
      ngram - the ngram
      count - number of occurrences to add
    • distance

      public double distance(LanguageProfile that)
      Calculates the geometric distance between this profile and the given language profile.
      Parameters:
      that - the other language profile
      Returns:
      distance between the profiles
    • toString

      public String toString()
      Overrides:
      toString in class Object
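
The counting and distance methods above can be illustrated with a small stand-alone sketch. NgramProfileSketch is a hypothetical class written for this example, not Tika's implementation. It assumes that "geometric distance" means the Euclidean distance between the two profiles' relative ngram frequencies (each ngram count divided by the profile's total count); the actual class may compute distance differently, for example when useInterleaved is set.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in that mirrors the method names documented above. */
class NgramProfileSketch {
    // Occurrence count per ngram.
    private final Map<String, Long> ngrams = new HashMap<>();
    // Total number of ngram occurrences added so far.
    private long total = 0;

    /** Adds a single occurrence of the given ngram. */
    void add(String ngram) {
        add(ngram, 1);
    }

    /** Adds multiple occurrences of the given ngram. */
    void add(String ngram, long count) {
        ngrams.merge(ngram, count, Long::sum);
        total += count;
    }

    /** Total count over all ngrams. */
    long getCount() {
        return total;
    }

    /** Count for one ngram, or 0 if it was never added. */
    long getCount(String ngram) {
        return ngrams.getOrDefault(ngram, 0L);
    }

    /**
     * Assumed definition: Euclidean distance between relative frequencies.
     * Totals are clamped to at least 1 so empty profiles do not divide by zero.
     */
    double distance(NgramProfileSketch that) {
        double thisTotal = Math.max(this.total, 1.0);
        double thatTotal = Math.max(that.total, 1.0);

        // Compare over the union of ngrams seen in either profile.
        Set<String> all = new HashSet<>(this.ngrams.keySet());
        all.addAll(that.ngrams.keySet());

        double sumOfSquares = 0.0;
        for (String ngram : all) {
            double diff = this.getCount(ngram) / thisTotal
                    - that.getCount(ngram) / thatTotal;
            sumOfSquares += diff * diff;
        }
        return Math.sqrt(sumOfSquares);
    }
}
```

Under this assumption, two profiles with the same relative frequencies are at distance 0 regardless of their sizes, and two profiles that share no ngrams are further apart. With Tika on the classpath, the documented equivalent would be to build two LanguageProfile instances from text with the LanguageProfile(String content) constructor and call distance on one with the other as the argument.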