UIMA Python Annotators and the Pythonnator

What is a Python Annotator?

A Python Annotator is a UIMA annotator component written in Python that can be used within the UIMA SDK framework.

What is the Pythonnator?

The Pythonnator is the link between the UIMA framework and a Python Annotator. It is itself a UIMA C++ annotator that can be referenced by primitive annotator or CAS consumer descriptors. The descriptor must define one configuration parameter; a second is optional.

When the Pythonnator is initialized, for example at CPE (Collection Processing Engine) initialization, the C++ code creates a Python interpreter, imports the specified script and calls the script's initialization method. Similarly, when the UIMA framework calls other Pythonnator methods such as process(), the corresponding methods in the Python script are called.
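The call forwarding described above can be illustrated with a small, pure-Python sketch. It is not the Pythonnator's actual C++ implementation or API: the ScriptHost class, the demo_annotator module, the argument values, and the choice to skip missing methods silently are all assumptions made for illustration.

```python
import importlib
import sys
import types


class ScriptHost:
    """Hypothetical stand-in for the C++ side: loads a script and forwards calls."""

    def __init__(self, module_name):
        # Import the script by module name (the name used here is illustrative).
        self.script = importlib.import_module(module_name)

    def call(self, method, *args):
        # Forward a framework call to the same-named function in the script.
        # Assumption: a method the script does not define is skipped silently.
        func = getattr(self.script, method, None)
        if func is None:
            return None
        return func(*args)


# A throwaway in-memory "script" so the sketch runs without files on disk.
demo = types.ModuleType("demo_annotator")
demo.calls = []
demo.initialize = lambda context: demo.calls.append(("initialize", context))
demo.process = lambda text: demo.calls.append(("process", text))
sys.modules["demo_annotator"] = demo

host = ScriptHost("demo_annotator")
host.call("initialize", {"param": "value"})  # done once, at initialization
host.call("process", "some text")            # done for each process() call
```

The real Pythonnator passes UIMA objects rather than plain dictionaries and strings; the sketch only shows the pattern of importing a script once and dispatching by method name.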

The Pythonnator also provides a Python library implementing an interface between Python and the UIMA APIs of the UIMA C++ framework.

Supported Platforms

The Pythonnator has been tested with Python versions 2.3 and 2.4 on Linux, and with version 2.4 on Windows (from http://www.python.org/ftp/python/2.4.2/python-2.4.2.msi). There are errors with version 2.5 on Windows.

Prerequisites

The Pythonnator uses SWIG (http://www.swig.org/) to implement the Python library interface to UIMA. SWIG version 1.3.29 or later is required.

The UIMA C++ framework is required.

In addition to the Python interpreter, a Python development package (python-devel on Linux) is required for building the Pythonnator. The Windows installer mentioned above includes a development package.

Pythonnator Distribution

Pythonnator code is distributed in source form and must be built on the target platform. A Makefile is supplied for Linux and a vcproj for Windows builds.

Pythonnator source and sample code are located in the $UIMACPP_HOME/scriptators directory.

Setting Environment Variables

The Pythonnator requires the standard environment for UIMA C++ components. In addition, PYTHONPATH must be set to identify where Python modules will be found.
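A quick way to confirm that PYTHONPATH identifies the directory holding a given module is to scan its entries directly. The helper below is a minimal sketch; the function name find_on_pythonpath is hypothetical and not part of the Pythonnator.

```python
import os


def find_on_pythonpath(module_file, pythonpath=None):
    """Return the first PYTHONPATH directory containing module_file, or None."""
    if pythonpath is None:
        pythonpath = os.environ.get("PYTHONPATH", "")
    # Entries are separated by os.pathsep (":" on Linux, ";" on Windows).
    for entry in pythonpath.split(os.pathsep):
        if entry and os.path.isfile(os.path.join(entry, module_file)):
            return entry
    return None
```

For example, find_on_pythonpath("pythonnator.py") returns the directory that supplies the library, or None if no PYTHONPATH entry contains it.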

Building and Installing the Pythonnator

Recursively copy the scriptators directory from the uimacpp distribution to a writable directory tree, then change to the writable scriptators/python directory.

On Linux

On Windows

Build results are the C++ annotator, _pythonnator.so on Linux or _pythonnator.dll on Windows, and the Python library interface to UIMA APIs, pythonnator.py.

If you have write access to the UIMA C++ distribution tree, on Linux copy _pythonnator.so and pythonnator.py to $UIMACPP_HOME/lib and add this directory to PYTHONPATH. On Windows, copy _pythonnator.dll and pythonnator.py to $UIMACPP_HOME/bin and add this directory to PYTHONPATH.

If you don't have write access, make sure that _pythonnator.so is in a directory on LD_LIBRARY_PATH (Linux) or that _pythonnator.dll is in a directory on PATH (Windows). Then copy pythonnator.py to the directory that holds your own Python modules and make sure PYTHONPATH includes that directory.
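For the case where the distribution tree is writable, the copy step can be scripted. This is a sketch only: install_pythonnator is a hypothetical helper, not something shipped with the distribution, and it assumes the lib or bin directory already exists under the UIMA C++ home directory.

```python
import os
import shutil
import sys


def install_pythonnator(build_dir, uimacpp_home, platform=sys.platform):
    """Copy the build results into the UIMA C++ tree; return the target directory."""
    if platform.startswith("win"):
        library, subdir = "_pythonnator.dll", "bin"
    else:
        library, subdir = "_pythonnator.so", "lib"
    # Assumption: the target directory already exists in the distribution.
    target = os.path.join(uimacpp_home, subdir)
    for name in (library, "pythonnator.py"):
        shutil.copy(os.path.join(build_dir, name), target)
    return target
```

The returned directory is the one that then needs to be added to PYTHONPATH.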

Testing the Pythonnator

A simple Python regular expression annotator, sample.py, is included in the distribution together with its descriptor, PythonSample.xml. Make sure that the directory containing sample.py is on PYTHONPATH, then use the descriptor as you would any other UIMA annotator descriptor.
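To give a feel for what a regular expression annotator computes, the sketch below finds match offsets in a piece of text using only the standard re module. It is not the code of sample.py and does not use the Pythonnator library; the pattern and the find_spans name are assumptions chosen for illustration.

```python
import re

# Hypothetical pattern: capitalized words. The pattern in sample.py may differ.
PATTERN = re.compile(r"\b[A-Z][a-z]+\b")


def find_spans(text, pattern=PATTERN):
    """Return (begin, end) character offsets for every match in text."""
    return [(m.start(), m.end()) for m in pattern.finditer(text)]
```

In a real annotator, each (begin, end) pair would become an annotation added to the CAS rather than a tuple in a list.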

Known Pythonnator Issues