2. Major Changes in this Release
3. List of Issues Fixed in this Release
8. More Documentation on Apache UIMA C++
Unstructured Information Management applications are software systems that analyze large volumes of unstructured information in order to discover knowledge that is relevant to an end user. UIMA is a framework and SDK for developing such applications. An example UIM application might ingest plain text and identify entities, such as persons, places, organizations; or relations, such as works-for or located-at. UIMA enables such an application to be decomposed into components, for example "language identification" -> "language specific segmentation" -> "sentence boundary detection" -> "entity detection (person/place names etc.)". Each component must implement interfaces defined by the framework and must provide self-describing metadata via XML descriptor files. The framework manages these components and the data flow between them. Components are written in Java or C++; the data that flows between components is designed for efficient mapping between these languages. UIMA additionally provides capabilities to wrap components as network services, and can scale to very large volumes by replicating processing pipelines over a cluster of networked nodes.
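The decomposition described above can be pictured with a minimal C++ sketch. It is not the UIMA C++ API: the names Document, Component, runPipeline and makeToyPipeline are hypothetical, and the two toy stages only stand in for real analysis components. The point is simply that each component receives the same shared data, adds its results, and hands it on to the next stage.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the data that flows between components.
// This is NOT the UIMA CAS; it only illustrates the idea.
struct Document {
    std::string text;
    std::vector<std::string> annotations;  // results added by components
};

// A component is modelled as a named function that enriches the document.
struct Component {
    std::string name;
    std::function<void(Document&)> process;
};

// Runs each component in order, passing the same document along the chain.
inline void runPipeline(const std::vector<Component>& pipeline, Document& doc) {
    for (const Component& c : pipeline) {
        assert(c.process && "every component needs a process function");
        c.process(doc);
    }
}

// Builds a toy two-stage pipeline. The stage logic is deliberately trivial:
// the language is hard-coded and sentences are counted by full stops.
inline std::vector<Component> makeToyPipeline() {
    return {
        {"language identification",
         [](Document& d) { d.annotations.push_back("lang=en"); }},
        {"sentence boundary detection",
         [](Document& d) {
             std::size_t count = 0;
             for (char ch : d.text) {
                 if (ch == '.') {
                     ++count;
                 }
             }
             d.annotations.push_back("sentences=" + std::to_string(count));
         }},
    };
}
```

In a real UIMA application the framework, not the application code, owns the equivalent of runPipeline, and each component also ships an XML descriptor that the sketch leaves out.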
Apache UIMA is an Apache-licensed open source implementation of the UIMA specification (that specification is, in turn, being developed concurrently by a technical committee within OASIS, a standards organization). We invite and encourage you to participate in both the implementation and specification efforts.
UIMA is a component framework for analysing unstructured content such as text, audio and video. It comprises an SDK and tooling for composing and running analytic components written in Java and C++, with some support for Perl, Python and TCL.
This section describes what has changed between version 2.3.0 and version
2.4.0 of UIMA C++.
Summary
- Build one source distribution which includes Windows and Linux files
- Clean up the Linux source distribution
- Migrate UIMA C++ service wrapper to ActiveMQ CPP 3.4.1
- BasicArrayFS has two unimplemented functions: copyToArray, copyFromArray
- Changes to standardize UIMA C++ build and packaging on Linux
- UIMA C++ service wrapper is not correctly shutting down when Java controller terminates
- UIMA CPP aggregate AE incorrect handling of sofa mapping
- TypeSystemDescription element missing when Aggregate AE metadata is serialized to XML
- Enable failover protocol support in UIMA C++ service wrapper
- Replace usage of ActiveMQ CPP utility APIs with APR functions
- Changes for GCC 4.3+ compatibility and header file conventions
- XMI serialization incorrectly handling string feature set to an empty string
- Clean up Linux source distribution
- Build one source distribution that includes Linux and Windows files
- Augment UIMACpp binary lic/notice with appropriate items from other embedded binaries
- Build script for uimacpp sdk on Windows does not correctly copy scriptators docs and xerces libs
- Fix warnings generated when creating the UIMA C++ doxygen docs
- UIMA C++ fails to build with gcc 4.5.2
- Add APR 1.4.x to list of acceptable versions to build UIMA C++
- Add APR-Util libraries to the UIMA C++ binary package
- UIMACPP build fails on Mac OS X: stricmp unavailable, replace with strcasecmp in deployCppService.hpp
- Xerces exception messages not being properly converted to native code-page
- Scriptator makefiles modified to work with different newer versions of Python and SWIG
- APR-iconv missing from UIMA C++ Windows binary build
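One of the items above concerns stricmp being unavailable on Mac OS X and being replaced with strcasecmp. As an illustration only, and not the actual change made in deployCppService.hpp, a portable case-insensitive comparison might be sketched as follows. The wrapper name portableStricmp is hypothetical; _stricmp (Microsoft C runtime) and strcasecmp (POSIX, declared in strings.h) are the real underlying functions.

```cpp
#include <cassert>
#include <string.h>

#if !defined(_WIN32)
#include <strings.h>  // POSIX declares strcasecmp here
#endif

// Hypothetical wrapper: compares two C strings ignoring case.
// Returns <0, 0 or >0 in the same manner as strcmp.
inline int portableStricmp(const char* a, const char* b) {
    assert(a != nullptr && b != nullptr);
#if defined(_WIN32)
    return _stricmp(a, b);     // Microsoft C runtime spelling
#else
    return strcasecmp(a, b);   // POSIX spelling (Linux, Mac OS X)
#endif
}
```

Routing every call through one wrapper keeps the platform check in a single place, so call sites do not need their own preprocessor conditionals.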
There are two distinct features of UIMA C++ to consider when dealing with release compatibility:
and are built with some version of the SDK. A possible scenario is for an application to run annotators that
were built with different releases of UIMA C++ SDK.
Binary compatibility therefore depends on the compatibility of these underlying libraries. In particular,
ICU and Xerces encode the major and minor release numbers in their APIs, which restricts binary compatibility across
releases of these libraries.
in a process and all annotators and underlying libraries must use the same ICU version.
We do not enforce binary compatibility when doing a release. Migrating to a new version of UIMA C++ may require rebuilding the annotators.
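To see why release numbers encoded in an API restrict binary compatibility, consider the following sketch of version-suffixed symbol renaming. It imitates the general technique with hypothetical names (the DEMO_ macros, demo_strlen, linkedSymbolName) and an illustrative version 4.4; it is not the actual macro set of ICU or Xerces. Callers write demo_strlen, but the name the compiler and linker see carries the version suffix, so a binary built against one release cannot resolve its symbols against a library from another release.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical library version suffix; the numbers are illustrative only.
#define DEMO_VERSION_SUFFIX _4_4

// Two-level concatenation so DEMO_VERSION_SUFFIX is expanded before pasting.
#define DEMO_CONCAT_INNER(a, b) a##b
#define DEMO_CONCAT(a, b) DEMO_CONCAT_INNER(a, b)
#define DEMO_RENAME(name) DEMO_CONCAT(name, DEMO_VERSION_SUFFIX)

// Two-level stringification for the same reason.
#define DEMO_STRINGIFY_INNER(x) #x
#define DEMO_STRINGIFY(x) DEMO_STRINGIFY_INNER(x)

// Source code says demo_strlen; the preprocessor turns it into demo_strlen_4_4.
#define demo_strlen DEMO_RENAME(demo_strlen)

// Defined (and later called) under the renamed, version-suffixed identifier.
inline std::size_t demo_strlen(const char* s) {
    assert(s != nullptr);
    return std::strlen(s);
}

// Reports the identifier that the source-level name demo_strlen expands to.
inline std::string linkedSymbolName() {
    return DEMO_STRINGIFY(demo_strlen);
}
```

If the suffix were changed to a different release number, every reference compiled earlier would point at a name the new library no longer exports, which is why a rebuild is needed.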
Installing the UIMACPP SDK as a system-wide shared library is discouraged because parallel
versions are not supported: the include directory carries no version number, and
multiple versions of the executables runAECpp and deployCppService cannot coexist.
The following are known open issues:
The Apache UIMA project really needs and appreciates any contributions, including documentation help, source code and feedback. If you are interested in contributing, please visit http://uima.apache.org/get-involved.html.
The Apache UIMA project uses JIRA for issue tracking. Please report any issues you find at http://issues.apache.org/jira/browse/uima
Please see Overview and Setup for a high level overview of UIMA C++, and Doxygen docs for details on the UIMA C++ APIs.