Apache UIMA (Unstructured Information Management Architecture) v2.3.0
-----------------------------------------------------------------------

What's New in 2.3.0
-------------------

Java 5 Generification
---------------------

Many of the core components have been updated to use generic types.
This mainly affects the various indexing and iterating classes and
interface APIs. Previous code using the non-generic APIs should
continue to work.

Components
----------

UIMA-AS was moved out of the Sandbox to the top level. It is no longer
a self-contained binary download; it requires downloading the base
UIMA as well.

The Cas Editor was moved out of the Sandbox and changed into an Eclipse
plugin that is now a fully integrated tool within the main UIMA
project.

Several components were added to the Sandbox release; see the Sandbox
README for details.

The classes supporting ecore were moved out of the core and into the
examples, so that the core and runtime modules do not require the
ecore support bundles. A readme was added to inform users how to set
up their installation to make use of the ecore functionality.

Running PEARs as components in a pipeline was improved: parameters,
performance reporting, and CAS Multipliers are all now supported.

The OpenNLP wrappers were removed from the examples project because
they were out of date with the OpenNLP project, which has now produced
its own UIMA wrappers; they should be obtained from there.

Performance
-----------

Many improvements were made to core UIMA to reduce the framework's
overhead and footprint, as a result of careful measurement of its use
in a very large scaled-out project needing low latency.

Communication between Clients and Remote Services now uses a "Delta
CAS" protocol, where only the changes from the Remote Service are sent
back to the client.

A new and more efficient form of serialization for CASes going to and
from Remote Services via UIMA-AS, called binary serialization, is
available.
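
As a rough illustration of the "Delta CAS" idea described above, the
following Java sketch models a CAS as a plain list of strings and shows
a service replying with only the entries it added. It is not UIMA code
and not the UIMA-AS wire protocol: the class DeltaReplySketch, its
methods, and the string entries are all hypothetical, and the sketch
covers additions only, not modifications to existing entries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: a "CAS" is modeled as a list of annotation strings. */
final class DeltaReplySketch {

    /** Hypothetical remote service: appends its results to the CAS it received. */
    static void analyze(List<String> cas) {
        cas.add("Result[0,3]");
        cas.add("Result[10,14]");
    }

    /** Returns a copy of only the entries at or after the marker position. */
    static List<String> delta(List<String> cas, int marker) {
        return new ArrayList<String>(cas.subList(marker, cas.size()));
    }

    /** Simulates one round trip; returns the client's CAS after merging the reply. */
    static List<String> roundTrip(List<String> clientCas) {
        // The full CAS travels to the service.
        List<String> serviceCopy = new ArrayList<String>(clientCas);
        // Remember how much the client already holds.
        int marker = serviceCopy.size();
        analyze(serviceCopy);
        // Only the entries added after the marker travel back.
        List<String> reply = delta(serviceCopy, marker);
        List<String> merged = new ArrayList<String>(clientCas);
        merged.addAll(reply);
        return merged;
    }

    public static void main(String[] args) {
        List<String> client = new ArrayList<String>(
                Arrays.asList("Sentence[0,20]", "Token[0,3]"));
        System.out.println(roundTrip(client));
    }
}
```

In this model the reply shrinks from the full list to the two new
entries; the saving grows with the size of the CAS the client already
holds.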

Robustness
----------

UIMA-AS has many enhancements to its handling of unusual and error
situations, based on feedback from the initial release.

Tools
-----

A new Eclipse tool, the CasEditor, allows users to work with CASes.

A new UimaBootstrap launcher can be used to simplify class paths;
besides directories of .class files and .jar files, it can also take
directories of .jar files, adding all the jars found in each such
directory to the classpath.

Sources for Jars in Maven
-------------------------

If you use Maven, you will find that the UIMA Jars now have Source Jar
attachments, and the mvn eclipse:eclipse command will attach these
sources so that Eclipse can find the source files that correspond to
the Jars.

Potential Incompatibilities
---------------------------

The default name generated by the XmiWriterCasConsumer in the examples
has been changed to include the ".xmi" suffix.

Supported Platforms
-------------------

Apache UIMA requires Java level 1.5; it has been tested with Sun Java
SDK v5.0 and v6.0. Running the Eclipse plugin tooling for UIMA also
requires that you start Eclipse using Java 1.5 or later.

The supported platforms are: Windows, Linux, Solaris, AIX and Mac OS X.
Other platforms and Java (1.5+) implementations should work, but have
not been significantly tested.

Many of the scripts in the /bin directory invoke Java. They use the
value of the environment variable JAVA_HOME to locate the Java to use;
if it is not set, they invoke "java", expecting to find an appropriate
Java in your PATH.

Environment Variables
---------------------

After you have unpacked the Apache UIMA distribution from the package
of your choice (e.g. .zip or .gz), perform the steps below to set up
UIMA so that it will function properly.

* Set JAVA_HOME to the directory of the JRE installation you would
  like to use for UIMA.
* Set UIMA_HOME to the apache-uima directory of your unpacked Apache
  UIMA distribution.
* Append UIMA_HOME/bin to your PATH.
* Run the script UIMA_HOME/bin/adjustExamplePaths.bat (or .sh) to
  update paths in the examples based on the actual UIMA_HOME directory
  path. This script runs a Java program; you must either have java in
  your PATH or set the environment variable JAVA_HOME to a suitable
  JRE.

Note: The procedures for setting up global environment variables on
Mac OS X are described here:
http://developer.apple.com/qa/qa2001/qa1067.html

Verifying Your Installation
---------------------------

To test the installation, run the documentAnalyzer.bat (or .sh) file
located in the bin subdirectory. This should pop up a "Document
Analyzer" window. Set the values displayed in this GUI as follows:

* Input Directory: UIMA_HOME/examples/data
* Output Directory: UIMA_HOME/examples/data/processed
* Location of Analysis Engine XML Descriptor:
  UIMA_HOME/examples/descriptors/analysis_engine/PersonTitleAnnotator.xml

Replace UIMA_HOME above with the path of your Apache UIMA installation.

Next, click the "Run" button, which should, after a brief pause, pop up
an "Analyzed Results" window. Double-click on one of the documents to
display the analysis results for that document.

Getting Started
---------------

For an introduction to Apache UIMA and how to use it, please read the
documentation located in the docs subdirectory. A good place to start
is the overview_and_setup book's first chapter, which has a brief guide
to the documentation.

Disclaimer
----------

Apache UIMA is an effort undergoing incubation at The Apache Software
Foundation (ASF). Incubation is required of all newly accepted projects
until a further review indicates that the infrastructure,
communications, and decision making process have stabilized in a manner
consistent with other successful ASF projects. While incubation status
is not necessarily a reflection of the completeness or stability of the
code, it does indicate that the project has yet to be fully endorsed by
the ASF.