Class CoreNLPNERecogniser

java.lang.Object
org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
All Implemented Interfaces:
NERecogniser

public class CoreNLPNERecogniser extends Object implements NERecogniser
This class implements NERecogniser using CRF classifiers from Stanford CoreNLP. Because it binds to Stanford CoreNLP at runtime, this recogniser requires additional setup. See the Tika NER Wiki for configuration details.
  • Field Details

  • Constructor Details

    • CoreNLPNERecogniser

      public CoreNLPNERecogniser()
    • CoreNLPNERecogniser

      public CoreNLPNERecogniser(String modelPath)
      Creates a NERecogniser by loading the model from the given path.
      Parameters:
      modelPath - path to the NER model file
  • Method Details

    • main

      public static void main(String[] args) throws IOException, com.github.openjson.JSONException
      Throws:
      IOException
      com.github.openjson.JSONException
    • isAvailable

      public boolean isAvailable()
      Description copied from interface: NERecogniser
      Checks whether this named entity recogniser is available for service.
      Specified by:
      isAvailable in interface NERecogniser
      Returns:
      true if the model was available and valid and the classifier was initialised successfully; false if this recogniser is not available for service.
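
      A caller would typically check isAvailable() before calling recognise(String), since a missing or invalid model leaves the recogniser unusable. The following is a minimal sketch of that guard, not code from Tika: RecogniserLike is a hypothetical local stand-in that mirrors only the three methods documented on this page, so the example compiles without Tika or Stanford CoreNLP on the classpath. AvailabilityGuard and recogniseIfAvailable are likewise illustrative names.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;

public class AvailabilityGuard {

    /** Hypothetical stand-in mirroring the three methods documented on this page. */
    public interface RecogniserLike {
        boolean isAvailable();
        Set<String> getEntityTypes();
        Map<String, Set<String>> recognise(String text);
    }

    /**
     * Returns the recognised entities, or an empty map when the recogniser
     * is absent, reports itself unavailable, or there is no text to scan.
     */
    public static Map<String, Set<String>> recogniseIfAvailable(RecogniserLike ner, String text) {
        if (ner == null || !ner.isAvailable() || text == null || text.isEmpty()) {
            return Collections.emptyMap();
        }
        return ner.recognise(text);
    }

    public static void main(String[] args) {
        // Simulates a recogniser whose model failed to load.
        RecogniserLike unavailable = new RecogniserLike() {
            @Override
            public boolean isAvailable() {
                return false;
            }

            @Override
            public Set<String> getEntityTypes() {
                return Collections.emptySet();
            }

            @Override
            public Map<String, Set<String>> recognise(String text) {
                throw new IllegalStateException("model not loaded");
            }
        };

        // The guard short-circuits, so recognise() is never called here.
        System.out.println(recogniseIfAvailable(unavailable, "Some text").isEmpty());
    }
}
```

      With a real CoreNLPNERecogniser instance, the same check would be written directly against that object instead of the stand-in interface.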
    • getEntityTypes

      public Set<String> getEntityTypes()
      Gets the set of entity types recognised by this recogniser.
      Specified by:
      getEntityTypes in interface NERecogniser
      Returns:
      set of entity classes/types
    • recognise

      public Map<String,Set<String>> recognise(String text)
      Recognises names of entities in the text.
      Specified by:
      recognise in interface NERecogniser
      Parameters:
      text - text that may contain entity names
      Returns:
      map of entity type -> set of names found for that type
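
      The returned structure maps each entity type to the set of names found for that type. The sketch below shows one way to turn such a map into sorted, printable lines. It is an illustration only: EntityReport and format are hypothetical names, and the sample map in main is built by hand (the type labels and names are assumptions), not produced by an actual recogniser run.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class EntityReport {

    /**
     * Formats a map of entity type -> names as one line per type,
     * with types and names in natural (alphabetical) order.
     */
    public static List<String> format(Map<String, Set<String>> entities) {
        List<String> lines = new ArrayList<>();
        // TreeMap and TreeSet copies give a deterministic ordering.
        for (Map.Entry<String, Set<String>> entry : new TreeMap<>(entities).entrySet()) {
            String names = String.join(", ", new TreeSet<>(entry.getValue()));
            lines.add(entry.getKey() + ": " + names);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Hand-built sample in the shape returned by recognise(String).
        Map<String, Set<String>> sample = new HashMap<>();
        sample.put("PERSON", new HashSet<>(Arrays.asList("Ada Lovelace")));
        sample.put("LOCATION", new HashSet<>(Arrays.asList("Paris", "London")));

        for (String line : format(sample)) {
            System.out.println(line);
        }
    }
}
```

      Copying into sorted collections keeps the report stable across runs, since the interface signature does not promise any iteration order for the map or its sets.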