Class CTAKESUtils

java.lang.Object
org.apache.tika.parser.ctakes.CTAKESUtils

public class CTAKESUtils extends Object
This class provides methods for extracting biomedical information from plain text with CTAKESContentHandler, which relies on Apache cTAKES.

Apache cTAKES is built on top of the Apache UIMA framework and the OpenNLP toolkit.

  • Constructor Summary

    Constructors
    Constructor
    Description
    CTAKESUtils()
     
  • Method Summary

    Modifier and Type
    Method
    Description
    static org.apache.uima.analysis_engine.AnalysisEngine
    getAnalysisEngine(String aeDescriptor, String umlsUser, String umlsPass)
    Returns a new UIMA Analysis Engine (AE).
    static String
    getAnnotationProperty(org.apache.ctakes.typesystem.type.textsem.IdentifiedAnnotation annotation, CTAKESAnnotationProperty property)
    Returns the value of the given property of the annotation.
    static org.apache.uima.jcas.JCas
    getJCas(org.apache.uima.analysis_engine.AnalysisEngine ae)
    Returns a new JCas appropriate for the given Analysis Engine.
    static void
    reset(org.apache.uima.analysis_engine.AnalysisEngine ae, org.apache.uima.jcas.JCas jcas)
    Resets cTAKES objects, if created.
    static void
    resetAE(org.apache.uima.analysis_engine.AnalysisEngine ae)
    Resets the AE (AnalysisEngine), releasing all resources held by the current AE.
    static void
    resetCAS(org.apache.uima.jcas.JCas jcas)
    Resets the CAS (Common Analysis System), emptying it of all content.
    static void
    serialize(org.apache.uima.jcas.JCas jcas, CTAKESSerializer type, boolean prettyPrint, OutputStream stream)
    Serializes a CAS in the given format.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • CTAKESUtils

      public CTAKESUtils()
  • Method Details

    • getAnalysisEngine

      public static org.apache.uima.analysis_engine.AnalysisEngine getAnalysisEngine(String aeDescriptor, String umlsUser, String umlsPass) throws IOException, org.apache.uima.util.InvalidXMLException, org.apache.uima.resource.ResourceInitializationException, URISyntaxException
      Returns a new UIMA Analysis Engine (AE). This method ensures that only one instance of an AE is created.

      An Analysis Engine is a component responsible for analyzing unstructured information, discovering and representing semantic content. Unstructured information includes, but is not restricted to, text documents.

      Parameters:
      aeDescriptor - pathname of the XML file containing an AnalysisEngineDescription, which holds all of the information needed to instantiate and use an AnalysisEngine.
      umlsUser - UMLS username for NLM database
      umlsPass - UMLS password for NLM database
      Returns:
      an Analysis Engine for analyzing unstructured information.
      Throws:
      IOException - if any I/O error occurs.
      org.apache.uima.util.InvalidXMLException - if the input XML is not valid or does not specify a valid ResourceSpecifier.
      org.apache.uima.resource.ResourceInitializationException - if a failure occurred during production of the resource.
      URISyntaxException - if the URL of the resource is not formatted strictly according to RFC 2396 and cannot be converted to a URI.
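
      The URISyntaxException case can be reproduced with the standard library alone. The sketch below is an illustration, not the CTAKESUtils implementation: the class name DescriptorUriSketch and the descriptor paths are hypothetical. It uses new URI(String), which applies the same RFC 2396 check that URL.toURI() performs.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of Tika: shows which resource locations
// survive conversion to a URI under RFC 2396.
final class DescriptorUriSketch {

    /** Returns true when the given location can be parsed as a URI. */
    static boolean isConvertible(String location) {
        try {
            new URI(location);
            return true;
        } catch (URISyntaxException e) {
            // An unescaped space, for example, is illegal in a URI path.
            return false;
        }
    }

    public static void main(String[] args) {
        // All three paths are made up for the example.
        System.out.println(isConvertible("file:/opt/ctakes/desc/AggregateAE.xml"));
        System.out.println(isConvertible("file:/opt/my ctakes/desc/AggregateAE.xml"));
        System.out.println(isConvertible("file:/opt/my%20ctakes/desc/AggregateAE.xml"));
    }
}
```

      Only the second location fails, because of the raw space; percent-encoding it as %20 makes the location valid again. A descriptor stored under a directory whose name contains spaces is therefore one plausible trigger for this exception.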
    • getJCas

      public static org.apache.uima.jcas.JCas getJCas(org.apache.uima.analysis_engine.AnalysisEngine ae) throws org.apache.uima.resource.ResourceInitializationException
      Returns a new JCas appropriate for the given Analysis Engine. This method ensures that only one instance of a JCas is created. A JCas is a Java Cover Classes-based, object-oriented API for the CAS (Common Analysis System).

      Important: It is highly recommended that you reuse CAS objects rather than creating new CAS objects prior to each analysis. This is because CAS objects may be expensive to create and may consume a significant amount of memory.

      Parameters:
      ae - AnalysisEngine used to create an appropriate JCas object.
      Returns:
      a JCas object appropriate for the given AnalysisEngine.
      Throws:
      org.apache.uima.resource.ResourceInitializationException - if a CAS could not be created because this AnalysisEngine's CAS metadata (type system, type priorities, or FS indexes) are invalid.
    • serialize

      public static void serialize(org.apache.uima.jcas.JCas jcas, CTAKESSerializer type, boolean prettyPrint, OutputStream stream) throws SAXException, IOException
      Serializes a CAS in the given format.
      Parameters:
      jcas - CAS (Common Analysis System) to be serialized.
      type - type of cTAKES (UIMA) serializer used to write CAS.
      prettyPrint - true to do pretty printing of output.
      stream - OutputStream to which the information extracted by cTAKES is written.
      Throws:
      SAXException - if there was a SAX exception.
      IOException - if any I/O error occurs.
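
      What a prettyPrint flag typically toggles can be shown without cTAKES. The sketch below is a stand-in, assuming a plain DOM document and the JDK's own Transformer rather than the UIMA serializers that serialize uses. PrettyPrintSketch, the element names, and the sample text are all hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical stand-in, not the cTAKES serializer: writes a small XML
// document to an OutputStream with or without indentation.
final class PrettyPrintSketch {

    /** Builds a tiny document; a real CAS holds far richer content. */
    static Document sampleDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("annotations");
        Element mention = doc.createElement("mention");
        mention.setTextContent("aspirin");
        root.appendChild(mention);
        doc.appendChild(root);
        return doc;
    }

    /** Writes the document to the stream, indenting only when prettyPrint is true. */
    static void write(Document doc, boolean prettyPrint, OutputStream stream) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        transformer.setOutputProperty(OutputKeys.INDENT, prettyPrint ? "yes" : "no");
        transformer.transform(new DOMSource(doc), new StreamResult(stream));
    }

    /** Convenience wrapper that returns the serialized text. */
    static String render(boolean prettyPrint) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            write(sampleDocument(), prettyPrint, out);
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (Exception e) {
            throw new IllegalStateException("XML serialization failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render(false));
        System.out.println(render(true));
    }
}
```

      The compact form keeps everything on one line, which suits machine consumers; the indented form places each element on its own line for human readers. The content is identical in both cases.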
    • getAnnotationProperty

      public static String getAnnotationProperty(org.apache.ctakes.typesystem.type.textsem.IdentifiedAnnotation annotation, CTAKESAnnotationProperty property)
      Returns the value of the given property of the annotation.
      Parameters:
      annotation - IdentifiedAnnotation object.
      property - CTAKESAnnotationProperty enum constant identifying which property of the annotation to return.
      Returns:
      the value of the requested property.
    • reset

      public static void reset(org.apache.uima.analysis_engine.AnalysisEngine ae, org.apache.uima.jcas.JCas jcas)
      Resets cTAKES objects, if created. This method ensures that new cTAKES objects (namely, the Analysis Engine and the JCas) will be created the next time the getters of this class are called.
      Parameters:
      ae - UIMA Analysis Engine
      jcas - JCas object
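
      The lifecycle described for reset (create once, reuse, and create again after a reset) follows a common caching pattern. The sketch below shows that pattern in general form only; it does not reproduce how CTAKESUtils is implemented, and LazyResource is a hypothetical class with a StringBuilder standing in for an expensive object such as an Analysis Engine or a JCas.

```java
import java.util.function.Supplier;

// Hypothetical generic holder, not part of Tika: builds a resource on first
// use, hands back the same instance afterwards, and rebuilds after reset().
final class LazyResource<T> {

    private final Supplier<T> factory;
    private T instance;

    LazyResource(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Creates the resource on first use and returns the cached one afterwards. */
    synchronized T get() {
        if (instance == null) {
            instance = factory.get();
        }
        return instance;
    }

    /** Drops the cached resource so that the next get() builds a fresh one. */
    synchronized void reset() {
        instance = null;
    }

    public static void main(String[] args) {
        LazyResource<StringBuilder> holder = new LazyResource<>(StringBuilder::new);
        StringBuilder first = holder.get();
        System.out.println(first == holder.get()); // true: same cached instance
        holder.reset();
        System.out.println(first == holder.get()); // false: rebuilt after reset
    }
}
```

      Holding one instance avoids paying the construction cost before every analysis, which is the concern raised in the getJCas note about reusing CAS objects. A real holder would also release the old resource's native or pooled state before dropping the reference.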
    • resetCAS

      public static void resetCAS(org.apache.uima.jcas.JCas jcas)
      Resets the CAS (Common Analysis System), emptying it of all content.
      Parameters:
      jcas - JCas object
    • resetAE

      public static void resetAE(org.apache.uima.analysis_engine.AnalysisEngine ae)
      Resets the AE (AnalysisEngine), releasing all resources held by the current AE.
      Parameters:
      ae - UIMA Analysis Engine