Class NLTKNERecogniser

java.lang.Object
org.apache.tika.parser.ner.nltk.NLTKNERecogniser
All Implemented Interfaces:
NERecogniser

public class NLTKNERecogniser extends Object implements NERecogniser
This class offers an implementation of NERecogniser based on the ne_chunk() function of NLTK. This recogniser requires additional setup because it sends HTTP requests to an endpoint server that runs NLTK. See
  • Field Details

    • ENTITY_TYPES

      public static final Set<String> ENTITY_TYPES
      Some common entity types identified by NLTK.
  • Constructor Details

    • NLTKNERecogniser

      public NLTKNERecogniser()
  • Method Details

    • isAvailable

      public boolean isAvailable()
      Description copied from interface: NERecogniser
      Checks whether this named entity recogniser is available for service.
      Specified by:
      isAvailable in interface NERecogniser
      Returns:
      true if the server endpoint is available; false if the server endpoint is not available for service.
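
Because isAvailable() reports whether the server endpoint can be reached, a caller may want to check it before calling recognise(). The following is a minimal sketch of that guard, not the library's own code. To stay self-contained it uses a hypothetical stand-in interface, `Recogniser`, that only mirrors the method signatures documented on this page; the class name `AvailabilityGuardSketch` and the helper `recogniseIfAvailable` are likewise assumptions made for illustration.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;

public class AvailabilityGuardSketch {

    // Hypothetical stand-in mirroring the signatures documented on this page.
    // With Tika on the classpath, the real NERecogniser type would be used instead.
    interface Recogniser {
        boolean isAvailable();
        Map<String, Set<String>> recognise(String text);
    }

    // Returns an empty map instead of calling recognise() when the
    // recogniser is missing or reports that it is not available.
    static Map<String, Set<String>> recogniseIfAvailable(Recogniser recogniser, String text) {
        if (recogniser == null || !recogniser.isAvailable()) {
            return Collections.emptyMap();
        }
        return recogniser.recognise(text);
    }

    public static void main(String[] args) {
        // Stub that simulates an unreachable endpoint.
        Recogniser offline = new Recogniser() {
            @Override
            public boolean isAvailable() {
                return false;
            }

            @Override
            public Map<String, Set<String>> recognise(String text) {
                throw new IllegalStateException("must not be called while unavailable");
            }
        };
        // The guard short-circuits, so recognise() is never reached.
        System.out.println(recogniseIfAvailable(offline, "some text").isEmpty()); // prints "true"
    }
}
```

Returning an empty map keeps the caller's code path the same whether or not the endpoint is up; a caller that needs to tell "no entities found" apart from "service unavailable" would check isAvailable() separately.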
    • getEntityTypes

      public Set<String> getEntityTypes()
      Gets set of entity types recognised by this recogniser
      Specified by:
      getEntityTypes in interface NERecogniser
      Returns:
      set of entity classes/types
    • recognise

      public Map<String,Set<String>> recognise(String text)
      Recognises names of entities in the text.
      Specified by:
      recognise in interface NERecogniser
      Parameters:
      text - text which possibly contains names
      Returns:
      map of entity type -> set of names
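
The returned structure, a map of entity type to a set of names, can be consumed with ordinary collection code. The sketch below flattens such a map into sorted "TYPE: name" lines. The class name `RecogniseResultSketch`, the helper `toLines`, and the sample key and names are illustrative assumptions; the sample data is not actual NLTK output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class RecogniseResultSketch {

    // Flattens a map of entity type -> names into "TYPE: name" lines,
    // sorted by type and then by name so the output order is deterministic.
    static List<String> toLines(Map<String, Set<String>> entities) {
        List<String> lines = new ArrayList<>();
        Map<String, Set<String>> sortedByType = new TreeMap<>(entities);
        for (Map.Entry<String, Set<String>> entry : sortedByType.entrySet()) {
            Set<String> sortedNames = new TreeSet<>(entry.getValue());
            for (String name : sortedNames) {
                lines.add(entry.getKey() + ": " + name);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Illustrative data shaped like the documented return value;
        // the key "NAMES" and both names are made up for this example.
        Map<String, Set<String>> sample = new HashMap<>();
        sample.put("NAMES", new HashSet<>(Arrays.asList("Tika", "Apache")));

        for (String line : toLines(sample)) {
            System.out.println(line);
        }
        // prints:
        // NAMES: Apache
        // NAMES: Tika
    }
}
```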