UIMA Perl Annotators and the Perltator

What is a Perl Annotator?

A Perl Annotator is a UIMA annotator component written in Perl that can be used within the UIMA SDK framework.

What is the Perltator?

The Perltator is the link between the UIMA framework and a Perl Annotator. The Perltator is itself a UIMA C++ annotator that can be referenced by primitive annotator or CAS consumer descriptors. The descriptor must define one configuration parameter; a second is optional.

When the Perltator is initialized, e.g. at CPE initialization, the C++ code creates a Perl interpreter, sources the specified script, and calls the script's initialization method. Similarly, when the UIMA framework calls other Perltator methods such as process(), the subroutines of the same name in the Perl script are called.

The Perltator also provides a Perl library that implements an interface between Perl and the APIs of the UIMA C++ framework.

Supported Platforms

The Perltator has been tested with Perl version 5.8 on Linux and with ActivePerl-5.8.8.816-MSWin32-x86-255195.msi on Windows XP.

Prerequisites

The Perltator uses SWIG (http://www.swig.org/) to implement the Perl library interface to UIMA. SWIG version 1.3.29 or later is required.
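
As a rough illustration, the SWIG version requirement could be checked from a Unix shell before building. This is only a sketch under stated assumptions: the helper name version_ge is hypothetical, it relies on GNU sort with the -V (version sort) option, and it assumes that swig -version prints a line of the form "SWIG Version X.Y.Z".

```shell
# Sketch only: version_ge is a hypothetical helper, not part of UIMA or SWIG.
# Succeeds when version $1 is greater than or equal to version $2.
# Assumes GNU sort, which supports -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

REQUIRED_SWIG="1.3.29"

if command -v swig >/dev/null 2>&1; then
    # Assumes the output contains a line such as "SWIG Version 4.0.2".
    found=$(swig -version | awk '/^SWIG Version/ { print $3 }')
    if version_ge "$found" "$REQUIRED_SWIG"; then
        echo "SWIG $found is new enough"
    else
        echo "SWIG $found is older than $REQUIRED_SWIG"
    fi
else
    echo "swig not found on PATH"
fi
```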

The UIMA C++ framework is required.

Also necessary is the Perl development package. On some Unix platforms the dev kit is not included with the standard interpreter package. A Unix command line test for the dev kit is

    perl -MExtUtils::Embed -e ldopts

which returns the gcc options necessary to link with the interpreter.

Perltator Distribution

Perltator code is distributed in source form and must be built on the target platform.

Perltator source and sample code are located in the $UIMACPP_HOME/scriptators directory.

Setting Environment Variables

The Perltator requires the standard environment for UIMA C++ components. In addition, the PERLLIB environment variable should include the directory that contains the perltator.pm file.
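
On Linux, the PERLLIB setting might look like the following sketch. The variable PERTLATOR_DIR does not exist in the distribution; the name used here, PERLTATOR_DIR, is hypothetical, and the /opt/uimacpp fallback is only a placeholder for when UIMACPP_HOME is unset. On Windows, PERLLIB entries are separated by semicolons rather than colons.

```shell
# Sketch only: PERLTATOR_DIR is a hypothetical variable naming the directory
# that holds perltator.pm; /opt/uimacpp is a placeholder default.
PERLTATOR_DIR="${UIMACPP_HOME:-/opt/uimacpp}/lib"

# Prepend the directory to PERLLIB unless it is already listed.
# PERLLIB entries are colon-separated on Unix-like systems.
case ":${PERLLIB:-}:" in
    *":$PERLTATOR_DIR:"*) ;;
    *) PERLLIB="$PERLTATOR_DIR${PERLLIB:+:$PERLLIB}" ;;
esac
export PERLLIB

echo "PERLLIB=$PERLLIB"
```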

Building and Installing the Perltator

Recursively copy the scriptators directory from the uimacpp distribution to a writable directory tree, then change to the writable scriptators/perl directory.
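
On Linux, the copy step could be scripted roughly as follows. The helper name copy_scriptators and the destination path in the usage comment are hypothetical; only the scriptators directory name comes from the distribution.

```shell
# Sketch only: copy_scriptators is a hypothetical helper.
# $1 = root of the uimacpp distribution, $2 = writable destination directory.
copy_scriptators() {
    src="$1/scriptators"
    dest="$2"
    if [ ! -d "$src" ]; then
        echo "no scriptators directory under $1" >&2
        return 1
    fi
    # cp -R copies the whole tree, including the perl subdirectory.
    mkdir -p "$dest" && cp -R "$src" "$dest/"
}

# Example use (the destination path is an arbitrary choice):
#   copy_scriptators "$UIMACPP_HOME" "$HOME/uima-work" &&
#       cd "$HOME/uima-work/scriptators/perl"
```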

On Linux

On Windows

The build produces the C++ annotator, perltator.so on Linux or perltator.dll on Windows, and the Perl module interface to the UIMA APIs, perltator.pm.

If you have write access to the UIMA C++ distribution tree, on Linux copy perltator.pm and perltator.so to $UIMACPP_HOME/lib and add this directory to PERLLIB. On Windows copy perltator.dll and perltator.pm to $UIMACPP_HOME/bin and add this directory to PERLLIB.

If you don't have write access, make sure that perltator.so (Linux) or perltator.dll (Windows) is in a directory listed in LD_LIBRARY_PATH or PATH, respectively, and that perltator.pm is in a directory listed in PERLLIB.
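
One way to verify these settings on Linux is to search each colon-separated variable for the expected file. The sketch below uses a hypothetical helper, find_in_path, which is not part of the Perltator distribution; it prints the first match and fails if there is none.

```shell
# Sketch only: find_in_path is a hypothetical helper.
# $1 = file name to look for, $2 = colon-separated list of directories.
find_in_path() {
    file="$1"
    path_list="$2"
    old_ifs="$IFS"
    IFS=':'
    for dir in $path_list; do
        if [ -n "$dir" ] && [ -f "$dir/$file" ]; then
            IFS="$old_ifs"
            printf '%s\n' "$dir/$file"
            return 0
        fi
    done
    IFS="$old_ifs"
    return 1
}

# Report where each build result is found, or warn if it is missing.
find_in_path perltator.pm "${PERLLIB:-}" ||
    echo "perltator.pm not found in PERLLIB"
find_in_path perltator.so "${LD_LIBRARY_PATH:-}" ||
    echo "perltator.so not found in LD_LIBRARY_PATH"
```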

Testing the Perltator

A simple Perl regular expression annotator, sample.pl, with descriptor PerlSample.xml, is included in the distribution. Perl annotators must be located in a directory listed in the PATH environment variable, and on Linux the .pl files must be executable. Use the descriptor as you would any other UIMA annotator descriptor.
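
On Linux, the two requirements for the script (executable, and reachable through PATH) could be met with something like the sketch below. The helper name prepare_annotator is hypothetical; sample.pl is the sample named above.

```shell
# Sketch only: prepare_annotator is a hypothetical helper.
# Marks a Perl annotator script executable and puts its directory on PATH.
prepare_annotator() {
    script="$1"
    [ -f "$script" ] || { echo "not found: $script" >&2; return 1; }
    chmod +x "$script" || return 1
    # Resolve the script's directory to an absolute path.
    dir=$(cd "$(dirname "$script")" && pwd) || return 1
    case ":$PATH:" in
        *":$dir:"*) ;;
        *) PATH="$dir:$PATH"; export PATH ;;
    esac
}

# Example use, run from the directory holding the sample:
#   prepare_annotator ./sample.pl
```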

Known Perltator Issues