UIMA Tutorial and Developers' Guides
Annotator and Analysis Engine Developer's Guide
Chapter 1, Annotator and Analysis Engine Developer's Guide
Getting Started
Section 1.1, “Getting Started”
Defining Types
Section 1.1.1, “Defining Types”
Generating Java Source Files for CAS Types
Section 1.1.2, “Generating Java Source Files for CAS Types”
Developing Your Annotator Code
Section 1.1.3, “Developing Your Annotator Code”
Creating the XML Descriptor
Section 1.1.4, “Creating the XML Descriptor”
Testing Your Annotator
Section 1.1.5, “Testing Your Annotator”
Configuration and Logging
Section 1.2, “Configuration and Logging”
Configuration Parameters
Section 1.2.1, “Configuration Parameters”
Declaring Parameters in the Descriptor
Section 1.2.1.1, “Declaring Parameters in the Descriptor”
Accessing Parameter Values from the Annotator Code
Section 1.2.1.2, “Accessing Parameter Values from the Annotator Code”
Supporting Reconfiguration
Section 1.2.1.3, “Supporting Reconfiguration”
Configuration Parameter Groups
Section 1.2.1.4, “Configuration Parameter Groups”
Logging
Section 1.2.2, “Logging”
Specifying the Logging Configuration
Section 1.2.2.1, “Specifying the Logging Configuration”
Setting Logging Levels
Section 1.2.2.2, “Setting Logging Levels”
Format of logging output
Section 1.2.2.3, “Format of logging output”
Meaning of the logging severity levels
Section 1.2.2.4, “Meaning of the logging severity levels”
Using the logger outside of an annotator
Section 1.2.2.5, “Using the logger outside of an annotator”
Building Aggregate Analysis Engines
Section 1.3, “Building Aggregate Analysis Engines”
Combining Annotators
Section 1.3.1, “Combining Annotators”
Combining Annotators to form an Aggregate Analysis Engine
Figure 1.1, “Combining Annotators to form an Aggregate Analysis Engine”
AEs can also contain CAS Consumers
Section 1.3.2, “AEs can also contain CAS Consumers”
Reading the Results of Previous Annotators
Section 1.3.3, “Reading the Results of Previous Annotators”
An Aggregate Analysis Engine where an internal component uses output from previous engines
Figure 1.2, “An Aggregate Analysis Engine where an internal component uses output from previous engines”
Other examples
Section 1.4, “Other examples”
Additional Topics
Section 1.5, “Additional Topics”
Contract: Annotator Methods Called by the Framework
Section 1.5.1, “Annotator Methods”
Reporting errors from Annotators
Section 1.5.2, “Reporting errors from Annotators”
Throwing Exceptions from Annotators
Section 1.5.3, “Throwing Exceptions from Annotators”
Accessing External Resource Files
Section 1.5.4, “Accessing External Resource Files”
Declaring Resource Dependencies
Section 1.5.4.1, “Declaring Resource Dependencies”
Accessing the Resource from the UimaContext
Section 1.5.4.2, “Accessing the Resource from the UimaContext”
Declaring Resources and Bindings
Section 1.5.4.3, “Declaring Resources and Bindings”
External Resource Binding
Figure 1.3, “External Resource Binding”
Sharing Resources among Annotators
Section 1.5.4.4, “Sharing Resources among Annotators”
Component engines of an aggregate share a common resource
Figure 1.4, “Component engines of an aggregate share a common resource”
Result Specifications
Section 1.5.5, “Result Specifications”
Default ResultSpecification
Section 1.5.5.1, “Default ResultSpecification”
Passing Result Specifications to Annotators
Section 1.5.5.2, “Passing Result Specifications to Annotators”
Aggregates
Section 1.5.5.3, “Aggregates”
Collection Processing Engines
Section 1.5.5.4, “Collection Processing Engines”
Class path setup when using JCas
Section 1.5.6, “Class path setup when using JCas”
Using the Shell Scripts
Section 1.5.7, “Using the Shell Scripts”
Environment variables used by the shell scripts
Table 1.1, “Environment variables used by the shell scripts”
Common Pitfalls
Section 1.6, “Common Pitfalls”
Viewing UIMA objects in the Eclipse debugger
Section 1.7, “UIMA Objects in Eclipse Debugger”
Introduction to Analysis Engine Descriptor XML Syntax
Section 1.8, “Analysis Engine XML Descriptor”
Header and Annotator Class Identification
Section 1.8.1, “Header and Annotator Class Identification”
Simple Metadata Attributes
Section 1.8.2, “Simple Metadata Attributes”
Type System Definition
Section 1.8.3, “Type System Definition”
Capabilities
Section 1.8.4, “Capabilities”
Configuration Parameters (Optional)
Section 1.8.5, “Configuration Parameters (Optional)”
Configuration Parameter Declarations
Section 1.8.5.1, “Configuration Parameter Declarations”
Configuration Parameter Settings
Section 1.8.5.2, “Configuration Parameter Settings”
Aggregate Analysis Engine Descriptor
Section 1.8.5.3, “Aggregate Analysis Engine Descriptor”
Collection Processing Engine Developer's Guide
Chapter 2, Collection Processing Engine Developer's Guide
CPE Concepts
Section 2.1, “CPE Concepts”
CPE Components
Figure 2.1, “CPE Components”
CPE Configurator and CAS viewer
Section 2.2, “CPE Configurator and CAS viewer”
Using the CPE Configurator
Section 2.2.1, “Using the CPE Configurator”
Running the CPE Configurator from Eclipse
Section 2.2.2, “Running the CPE Configurator from Eclipse”
Running a CPE from Your Own Java Application
Section 2.3, “Running a CPE from Your Own Java Application”
Using Listeners
Section 2.3.1, “Using Listeners”
Developing Collection Processing Components
Section 2.4, “Developing Collection Processing Components”
Developing Collection Readers
Section 2.4.1, “Developing Collection Readers”
Java Class for the Collection Reader
Section 2.4.1.1, “Java Class for the Collection Reader”
Required Methods in the Collection Reader class
Section 2.4.1.2, “Required Methods in the Collection Reader class”
initialize()
the section called “initialize()”
hasNext()
the section called “hasNext()”
getNext(CAS)
the section called “getNext(CAS)”
getProgress()
the section called “getProgress()”
close()
the section called “close()”
Optional Methods
the section called “Optional Methods”
reconfigure()
the section called “reconfigure()”
typeSystemInit()
the section called “typeSystemInit()”
Threading considerations
the section called “Threading considerations”
XML Descriptor for a Collection Reader
the section called “XML Descriptor for a Collection Reader”
Developing CAS Initializers
Section 2.4.2, “Developing CAS Initializers”
Developing CAS Consumers
Section 2.4.3, “Developing CAS Consumers”
Required Methods for a CAS Consumer
Section 2.4.3.1, “Required Methods for a CAS Consumer”
initialize()
the section called “initialize()”
processCas()
the section called “processCas()”
Optional Methods
the section called “Optional Methods”
batchProcessComplete()
the section called “batchProcessComplete()”
collectionProcessComplete()
the section called “collectionProcessComplete()”
Deploying a CPE
Section 2.5, “Deploying a CPE”
CPE Instantiation
Figure 2.2, “CPE Instantiation”
Deploying Managed CAS Processors
Section 2.5.1, “Deploying Managed CAS Processors”
CPE with Managed CAS Processors
Figure 2.3, “CPE with Managed CAS Processors”
Deploying Non-managed CAS Processors
Section 2.5.2, “Deploying Non-managed CAS Processors”
CPE with non-managed CAS Processors
Figure 2.4, “CPE with non-managed CAS Processors”
Deploying Integrated CAS Processors
Section 2.5.3, “Deploying Integrated CAS Processors”
CPE with integrated CAS Processor
Figure 2.5, “CPE with integrated CAS Processor”
Collection Processing Examples
Section 2.6, “Collection Processing Examples”
Application Developer's Guide
Chapter 3, Application Developer's Guide
The UIMAFramework Class
Section 3.1, “The UIMAFramework Class”
Using Analysis Engines
Section 3.2, “Using Analysis Engines”
Instantiating an Analysis Engine
Section 3.2.1, “Instantiating an Analysis Engine”
Analyzing Text Documents
Section 3.2.2, “Analyzing Text Documents”
Analyzing Non-Text Artifacts
Section 3.2.3, “Analyzing Non-Text Artifacts”
Accessing Analysis Results
Section 3.2.4, “Accessing Analysis Results”
Accessing Analysis Results using the JCas
Section 3.2.4.1, “Accessing Analysis Results using the JCas”
Accessing Analysis Results using the CAS
Section 3.2.4.2, “Accessing Analysis Results using the CAS”
Multi-threaded Applications
Section 3.2.5, “Multi-threaded Applications”
Using Multiple Analysis Engines and Creating Shared CASes
Section 3.2.6, “Multiple AEs & Creating Shared CASes”
Saving CASes to file systems
Section 3.2.7, “Saving CASes to file systems”
Using Collection Processing Engines
Section 3.3, “Using Collection Processing Engines”
Running a Collection Processing Engine from a Descriptor
Section 3.3.1, “Running a CPE from a Descriptor”
Configuring a Collection Processing Engine Descriptor Programmatically
Section 3.3.2, “Configuring a CPE Descriptor Programmatically”
Setting Configuration Parameters
Section 3.4, “Setting Configuration Parameters”
Integrating Text Analysis and Search
Section 3.5, “Integrating Text Analysis and Search”
Building an Index
Section 3.5.1, “Building an Index”
Configuring the Semantic Search CAS Indexer
Section 3.5.1.1, “Configuring the Semantic Search CAS Indexer”
Building and Running a CPE including the Semantic Search CAS Indexer
Section 3.5.1.2, “Using Semantic Search CAS Indexer”
Semantic Search Query Tool
Section 3.5.2, “Semantic Search Query Tool”
Working with Remote Services
Section 3.6, “Working with Remote Services”
Deploying a UIMA Component as a SOAP Service
Section 3.6.1, “Deploying as SOAP Service”
Deploying a UIMA Component as a Vinci Service
Section 3.6.2, “Deploying as a Vinci Service”
How to Call a UIMA Service
Section 3.6.3, “Calling a UIMA Service”
SOAP Service Client Descriptor
Section 3.6.3.1, “SOAP Service Client Descriptor”
Vinci Service Client Descriptor
Section 3.6.3.2, “Vinci Service Client Descriptor”
Restrictions on remotely deployed services
Section 3.6.4, “Restrictions on remotely deployed services”
The Vinci Naming Services (VNS)
Section 3.6.5, “The Vinci Naming Services (VNS)”
Starting VNS
Section 3.6.5.1, “Starting VNS”
VNS Files
Section 3.6.5.2, “VNS Files”
Launching Vinci Services
Section 3.6.5.3, “Launching Vinci Services”
Configuring Timeout Settings
Section 3.6.6, “Configuring Timeout Settings”
Setting the Client Timeout
Section 3.6.6.1, “Setting the Client Timeout”
Setting the Server Socket Timeout
Section 3.6.6.2, “Setting the Server Socket Timeout”
Increasing performance using parallelism
Section 3.7, “Increasing performance using parallelism”
Monitoring AE Performance using JMX
Section 3.8, “Monitoring AE Performance using JMX”
Flow Controller Developer's Guide
Chapter 4, Flow Controller Developer's Guide
Developing the Flow Controller Code
Section 4.1, “Developing the Flow Controller Code”
Flow Controller Interface Overview
Section 4.1.1, “Flow Controller Interface Overview”
Example Code
Section 4.1.2, “Example Code”
The WhiteboardFlowController Class
Section 4.1.2.1, “The WhiteboardFlowController Class”
The WhiteboardFlow Class
Section 4.1.2.2, “The WhiteboardFlow Class”
Creating the Flow Controller Descriptor
Section 4.2, “Creating the Flow Controller Descriptor”
Adding a Flow Controller to an Aggregate Analysis Engine
Section 4.3, “Adding Flow Controller to an Aggregate”
Adding a Flow Controller to a Collection Processing Engine
Section 4.4, “Adding Flow Controller to CPE”
Using Flow Controllers with CAS Multipliers
Section 4.5, “Using Flow Controllers with CAS Multipliers”
Annotations, Artifacts, and Sofas
Chapter 5, Annotations, Artifacts, and Sofas
Terminology
Section 5.1, “Terminology”
Artifact
Section 5.1.1, “Artifact”
Subject of Analysis — Sofa
Section 5.1.2, “Subject of Analysis — Sofa”
Formats of Sofa Data
Section 5.2, “Formats of Sofa Data”
Setting and Accessing Sofa Data
Section 5.3, “Setting and Accessing Sofa Data”
Setting Sofa Data
Section 5.3.1, “Setting Sofa Data”
Accessing Sofa Data
Section 5.3.2, “Accessing Sofa Data”
Accessing Sofa Data using a Java Stream
Section 5.3.3, “Accessing Sofa Data using a Java Stream”
The Sofa Feature Structure
Section 5.4, “The Sofa Feature Structure”
Annotations
Section 5.5, “Annotations”
Built-in Annotation types
Section 5.5.1, “Built-in Annotation types”
Annotations have an associated Sofa
Section 5.5.2, “Annotations have an associated Sofa”
AnnotationBase
Section 5.6, “AnnotationBase”
Multiple CAS Views of an Artifact
Chapter 6, Multiple CAS Views of an Artifact
CAS Views and Sofas
Section 6.1, “CAS Views and Sofas”
Naming CAS Views and Sofas
Section 6.1.1, “Naming CAS Views and Sofas”
Multi-View, Single-View components & applications
Section 6.1.2, “Multi/Single View parts in Applications”
Multi-View Components
Section 6.2, “Multi-View Components”
How UIMA decides if a component is Multi-View
Section 6.2.1, “Deciding: Multi-View”
Multi-View: additional capabilities
Section 6.2.2, “Multi-View: additional capabilities”
Component XML metadata
Section 6.2.3, “Component XML metadata”
Sofa Capabilities and APIs for Applications
Section 6.3, “Sofa Capabilities & APIs for Apps”
Sofa Name Mapping
Section 6.4, “Sofa Name Mapping”
Name Mapping in an Aggregate Descriptor
Section 6.4.1, “Name Mapping in an Aggregate Descriptor”
Name Mapping in a CPE Descriptor
Section 6.4.2, “Name Mapping in a CPE Descriptor”
Specifying the CAS View for a Single-View Component
Section 6.4.3, “CAS View for Single-View Parts”
Name Mapping in a UIMA Application
Section 6.4.4, “Name Mapping in a UIMA Application”
Name Mapping for Remote Services
Section 6.4.5, “Name Mapping for Remote Services”
JCas extensions for Multiple Views
Section 6.5, “JCas extensions for Multiple Views”
Sample Multi-View Application
Section 6.6, “Sample Multi-View Application”
Annotator Descriptor
Section 6.6.1, “Annotator Descriptor”
Application Setup
Section 6.6.2, “Application Setup”
Annotator Processing
Section 6.6.3, “Annotator Processing”
Accessing the results of analysis
Section 6.6.4, “Accessing the results of analysis”
Views API Summary
Section 6.7, “Views API Summary”
Sofa Incompatibilities between UIMA version 1 and version 2
Section 6.8, “Sofa Incompatibilities: V1 and V2”
CAS Multiplier Developer's Guide
Chapter 7, CAS Multiplier Developer's Guide
Developing the CAS Multiplier Code
Section 7.1, “Developing the CAS Multiplier Code”
CAS Multiplier Interface Overview
Section 7.1.1, “CAS Multiplier Interface Overview”
How to Get an Empty CAS Instance
Section 7.1.2, “Getting an empty CAS Instance”
Example Code
Section 7.1.3, “Example Code”
Overall Structure
Section 7.1.3.1, “Overall Structure”
Initialize Method
Section 7.1.3.2, “Initialize Method”
Process Method
Section 7.1.3.3, “Process Method”
HasNext Method
Section 7.1.3.4, “HasNext Method”
Next Method
Section 7.1.3.5, “Next Method”
Creating the CAS Multiplier Descriptor
Section 7.2, “CAS Multiplier Descriptor”
Using a CAS Multiplier in an Aggregate Analysis Engine
Section 7.3, “Using CAS Multipliers in Aggregates”
Adding the CAS Multiplier to the Aggregate
Section 7.3.1, “Aggregate: Adding the CAS Multiplier”
CAS Multipliers and Flow Control
Section 7.3.2, “CAS Multipliers and Flow Control”
Aggregate CAS Multipliers
Section 7.3.3, “Aggregate CAS Multipliers”
Using a CAS Multiplier in a Collection Processing Engine
Section 7.4, “CAS Multipliers in CPE's”
Calling a CAS Multiplier from an Application
Section 7.5, “Applications: Calling CAS Multipliers”
Retrieving Output CASes from the CAS Multiplier
Section 7.5.1, “Output CASes”
Using a CAS Multiplier with other Analysis Engines
Section 7.5.2, “CAS Multipliers with other AEs”
Using a CAS Multiplier to Merge CASes
Section 7.6, “Merging with CAS Multipliers”
Overview of How to Merge CASes
Section 7.6.1, “CAS Merging Overview”
Example CAS Merger
Section 7.6.2, “Example CAS Merger”
Process Method
Section 7.6.2.1, “Process Method”
HasNext and Next Methods
Section 7.6.2.2, “HasNext and Next Methods”
Using the SimpleTextMerger in an Aggregate Analysis Engine
Section 7.6.3, “SimpleTextMerger in an Aggregate”
XMI and EMF Interoperability
Chapter 8, XMI and EMF Interoperability
Overview
Section 8.1, “Overview”
Converting an Ecore Model to or from a UIMA Type System
Section 8.2, “Converting an Ecore Model to or from a UIMA Type System”
Using XMI CAS Serialization
Section 8.3, “Using XMI CAS Serialization”