Package org.apache.uima.cas.impl

Implementation and Low-Level API for the CAS Interfaces.

Package org.apache.uima.cas.impl Description

Internal APIs. Use these APIs at your own risk. APIs in this package are subject to change without notice, even in minor releases. Use of this package is not supported. If you think you have found a bug in this package, please try to reproduce it with the officially supported APIs before reporting it.


Internals documentation

NOTE: This documentation is plain HTML, generated with the WYSIWYG editor TinyMCE. To work on it, set up a small web page that runs TinyMCE from a local file, then use Tools - Source code to cut and paste between this file's source and that editor.

Java Cover Objects

It is possible to run UIMA without creating Java cover objects for FSs.  However, for convenience, many of the APIs return Java objects that in turn provide various APIs for accessing features, updating indexes, etc.

There are two kinds of Java cover objects:

Both of these inherit from the FeatureStructure Interface.  Use of the JCas is optional; if the JCas cover classes are available (in the class path), they are used. 

UIMA Indexes

Indexes are defined for a pipeline, and are kept as part of the general CAS definition.

Each CAS View has its own instantiation of the defined indexes (there's one definition for all views), and as a result, a particular FS may be added-to-indexes and indexed in some views, and not in others.
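The per-view behavior can be pictured with a small model. This is only a sketch: PerViewIndexSketch, its methods, and the use of int addresses as FS identifiers are hypothetical names chosen for illustration, not classes or methods of this package.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical model (not a UIMA class): one index definition, one index instance per view. */
final class PerViewIndexSketch {
    // view name -> addresses of the FSs that were added to that view's index instance
    private final Map<String, Set<Integer>> indexedByView = new HashMap<>();

    /** Adds the FS to the index instance of exactly one view. */
    void addToIndexes(String view, int fsAddr) {
        indexedByView.computeIfAbsent(view, v -> new HashSet<>()).add(fsAddr);
    }

    /** True only if the FS was added to the indexes of this particular view. */
    boolean isIndexed(String view, int fsAddr) {
        Set<Integer> indexed = indexedByView.get(view);
        return indexed != null && indexed.contains(fsAddr);
    }

    public static void main(String[] args) {
        PerViewIndexSketch repo = new PerViewIndexSketch();
        repo.addToIndexes("viewA", 42);
        System.out.println(repo.isIndexed("viewA", 42)); // prints true
        System.out.println(repo.isIndexed("viewB", 42)); // prints false
    }
}
```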

There are 3 kinds of indexes: Sorted, Set, and Bag.  The basic object type for an index is FSLeafIndexImpl. This has 3 subtypes, one for each of the index types:

The leaf index is just for one type (and doesn't include entries for any subtypes).

Indexes are connected to specific index definitions; these definitions include a type which is the top type for elements of this index. The index definition logically includes that type and all of its subtypes.

An additional data structure, the IndexIteratorCachePair, is associated with each index definition.  It holds references to the FSLeafIndexImpls for all subtypes of an index; this list is created lazily, only when an iterator is created over this index at a particular type level (which can be the type the index was defined for, or any subtype).  This lazy aspect is important, because UIMA is often used with a very large type system that has many subtypes, only a few of which are used in a particular pipeline instance.
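The lazy construction described above can be sketched as follows. This is a simplified model under stated assumptions: LazySubtypeCacheSketch, leafIndexesFor, and buildCount are hypothetical names, and type names stand in for the per-type leaf indexes; the actual IndexIteratorCachePair is not reproduced here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model (not a UIMA class) of a lazily built per-type list of leaf indexes. */
final class LazySubtypeCacheSketch {
    private final Map<String, List<String>> directSubtypes; // type -> its direct subtypes
    private final Map<String, List<String>> cache = new HashMap<>();
    int buildCount = 0; // number of lists actually built, to make the laziness visible

    LazySubtypeCacheSketch(Map<String, List<String>> directSubtypes) {
        this.directSubtypes = directSubtypes;
    }

    /** Returns the type plus all of its subtypes; the list is built on first request only. */
    List<String> leafIndexesFor(String type) {
        List<String> cached = cache.get(type);
        if (cached == null) {
            cached = new ArrayList<>();
            collect(type, cached);
            cache.put(type, cached);
            buildCount++;
        }
        return cached;
    }

    // Depth-first walk over the type hierarchy, the type itself first.
    private void collect(String type, List<String> out) {
        out.add(type);
        List<String> subs = directSubtypes.get(type);
        for (String sub : (subs == null ? Collections.<String>emptyList() : subs)) {
            collect(sub, out);
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> subs = new HashMap<>();
        subs.put("Annotation", List.of("Token", "Sentence"));
        LazySubtypeCacheSketch cache = new LazySubtypeCacheSketch(subs);
        System.out.println(cache.leafIndexesFor("Annotation"));
        System.out.println("lists built: " + cache.buildCount);
    }
}
```

Nothing is built for types that never have an iterator requested, which is the point of the lazy design when the type system is much larger than what a pipeline uses.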

There are two tasks that indexes accomplish:

Iterators

There are two main kinds of iterators:

Iterators over UIMA indexes

There are two main kinds of iterators over UIMA indexes:

The basic int iterators are implemented with instances of the classes:

All of these implement an iterator over the corresponding FSLeafIndexImpl for one type.

The class PointerIterator in FSIndexRepositoryImpl is an int iterator that combines the iterators for a type and its subtypes into one aggregated iterator, taking into account the comparator's sort order across the individual iterators. For instance, a moveTo operation performs a moveTo on each of the individual iterators, and then determines which of those is the left-most one in the comparator ordering.
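The aggregation idea can be illustrated with a minimal sketch. It assumes natural int ordering in place of the index comparator and sorted int arrays in place of the per-type leaf iterators; MergedIteratorSketch and Leaf are hypothetical names, and this is not the PointerIterator source.

```java
import java.util.NoSuchElementException;

/** Hypothetical model (not a UIMA class) of an iterator that merges sorted per-type iterators. */
final class MergedIteratorSketch {
    /** A position in one sorted array, standing in for one per-type leaf iterator. */
    static final class Leaf {
        final int[] sorted;
        int pos = 0;
        Leaf(int[] sorted) { this.sorted = sorted; }
        boolean isValid() { return pos < sorted.length; }
        int get() { return sorted[pos]; }
        // Position at the first element that is not less than key.
        void moveTo(int key) {
            pos = 0;
            while (isValid() && sorted[pos] < key) pos++;
        }
    }

    private final Leaf[] leaves;

    MergedIteratorSketch(int[]... sortedPerType) {
        leaves = new Leaf[sortedPerType.length];
        for (int i = 0; i < sortedPerType.length; i++) leaves[i] = new Leaf(sortedPerType[i]);
    }

    // The valid leaf whose current element is smallest, or null if all are exhausted.
    private Leaf leftMost() {
        Leaf best = null;
        for (Leaf l : leaves) {
            if (l.isValid() && (best == null || l.get() < best.get())) best = l;
        }
        return best;
    }

    boolean isValid() { return leftMost() != null; }

    int get() {
        Leaf l = leftMost();
        if (l == null) throw new NoSuchElementException();
        return l.get();
    }

    void moveToNext() {
        Leaf l = leftMost();
        if (l != null) l.pos++;
    }

    /** Moves every leaf to the key; the left-most leaf is then found on demand. */
    void moveTo(int key) {
        for (Leaf l : leaves) l.moveTo(key);
    }

    public static void main(String[] args) {
        MergedIteratorSketch it = new MergedIteratorSketch(
            new int[] {1, 4, 9}, new int[] {2, 3, 10}, new int[] {5});
        it.moveTo(3);
        StringBuilder sb = new StringBuilder();
        for (; it.isValid(); it.moveToNext()) sb.append(it.get()).append(' ');
        System.out.println(sb.toString().trim()); // prints 3 4 5 9 10
    }
}
```

The sketch rescans all leaves on every call for brevity; an implementation that cares about cost would keep the leaves ordered between calls.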

PointerIteratorUnordered is a variant that also combines iterators for a type and its subtypes, but doesn't try to keep these in order.  It is designed to be used when iterating through all instances of a type and its subtypes, in an arbitrary order, such as what the method getAllIndexedFS(type) does.
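By contrast with an ordered merge, an unordered traversal only needs to visit each per-type collection in turn. The sketch below is an illustration under assumptions: UnorderedConcatSketch and allIndexed are hypothetical names, and int arrays stand in for the per-type leaf indexes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model (not a UIMA class): visit a type and its subtypes with no ordering work. */
final class UnorderedConcatSketch {
    /** Returns every element of every per-type array, one array after the other. */
    static List<Integer> allIndexed(int[]... perType) {
        List<Integer> out = new ArrayList<>();
        for (int[] leaf : perType) {
            for (int fs : leaf) out.add(fs); // no comparison between leaves is needed
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints [9, 1, 4, 3, 2]: each array is exhausted before the next one starts.
        System.out.println(allIndexed(new int[] {9, 1, 4}, new int[] {3, 2}));
    }
}
```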

SnapshotPointerIterator is a variant which creates a snapshot when the iterator is created, and then (ignoring any subsequent index updates) iterates over that.  This iterator won't throw ConcurrentModificationException.
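The difference between a live iterator and a snapshot can be shown with plain Java collections. This sketch uses java.util.ArrayList as a stand-in for an index; SnapshotIterationSketch and its methods are hypothetical names and do not reflect how SnapshotPointerIterator is implemented.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

/** Hypothetical model (not a UIMA class) contrasting live and snapshot iteration. */
final class SnapshotIterationSketch {
    /** Iterates the live list while adding to it; reports whether that failed. */
    static boolean liveIterationFails(List<Integer> index) {
        try {
            for (Integer fs : index) {
                index.add(fs + 100); // modifies the collection being iterated
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    /** Iterates a copy made up front; later updates to the live list are not seen. */
    static List<Integer> snapshotIteration(List<Integer> index) {
        List<Integer> visited = new ArrayList<>();
        Iterator<Integer> it = new ArrayList<>(index).iterator(); // snapshot taken here
        while (it.hasNext()) {
            int fs = it.next();
            visited.add(fs);
            index.add(fs + 100); // update to the live index, ignored by the snapshot
        }
        return visited;
    }

    public static void main(String[] args) {
        List<Integer> index = new ArrayList<>(List.of(1, 2, 3));
        System.out.println(snapshotIteration(index)); // prints [1, 2, 3]
        System.out.println(liveIterationFails(index)); // prints true
    }
}
```

The price of the snapshot is the copy made when the iterator is created.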

The basic implementations of IntIterator4bag/set/sorted are created by calls to pointerIterator; this method is implemented in each of the IntIterator4bag/set/sorted classes.

The second argument passed is a reference to this FSIndexRepositoryImpl's int[] that is used to detect concurrent modification.  If null is passed, no such check is done.  This kind of call happens with the refIterator() methods, which are used internally when it is known that the iteration will not modify the indexes in any way.
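A shared int[] can serve as a modification counter that an iterator compares against the value it saw at creation. The following is a sketch under assumptions: ModCountCheckSketch and CheckedIntIterator are hypothetical names, and a single-element counter is assumed for simplicity; the layout of the real int[] is not described here.

```java
import java.util.ConcurrentModificationException;

/** Hypothetical model (not a UIMA class) of concurrent-modification detection via a shared int[]. */
final class ModCountCheckSketch {
    static final class CheckedIntIterator {
        private final int[] data;
        private final int[] modCounter; // shared with the index owner; null disables checking
        private final int seenAtCreation;
        private int pos = 0;

        CheckedIntIterator(int[] data, int[] modCounter) {
            this.data = data;
            this.modCounter = modCounter;
            this.seenAtCreation = (modCounter == null) ? 0 : modCounter[0];
        }

        boolean isValid() { return pos < data.length; }

        int get() { check(); return data[pos]; }

        void moveToNext() { check(); pos++; }

        // Fails if the owner bumped the counter after this iterator was created.
        private void check() {
            if (modCounter != null && modCounter[0] != seenAtCreation) {
                throw new ConcurrentModificationException();
            }
        }
    }

    public static void main(String[] args) {
        int[] counter = {0};
        int[] data = {7, 8};
        CheckedIntIterator checked = new CheckedIntIterator(data, counter);
        CheckedIntIterator unchecked = new CheckedIntIterator(data, null);
        counter[0]++; // simulates an index update
        System.out.println(unchecked.get()); // prints 7; no check is performed
        try {
            checked.get();
        } catch (ConcurrentModificationException e) {
            System.out.println("modification detected");
        }
    }
}
```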


Iterators which return Java cover objects:

Plain FSIterators are created from index instances via the iterator() method; corresponding int iterators are created from low-level indexes via ll_iterator(). This method picks the appropriate underlying iterator based on

Iterator Interfaces

There are several overlapping interfaces (probably due to historical reasons) for these iterators.

First, interfaces for iterators returning ints:

Next, interfaces for iterators returning Java cover objects:

Copyright © 2006–2017 The Apache Software Foundation. All rights reserved.