A Perl Annotator is a UIMA annotator component written in Perl that can be used within the UIMA SDK framework.
The Perltator is the link between the UIMA framework and a Perl Annotator. It is itself a UIMA C++ annotator that can be referenced by primitive annotator or CAS consumer descriptors. The descriptor must define one configuration parameter; a second is optional:
SourceFile (mandatory) - a string holding the name of the Perl module to run.
DebugLevel (optional) - an integer that specifies the debug level for tracing. The default value is 0. A value of 101 turns on Perltator tracing. Values 1-100 are reserved for annotator developer use.
The Perltator also provides a Perl library implementing an interface between Perl and the UIMA APIs of the UIMA C++ framework.
The Perltator has been tested with Perl version 5.8 on Linux and with ActivePerl-5.8.8.816-MSWin32-x86-255195.msi on Windows XP.
The Perltator uses SWIG (http://www.swig.org/) to implement the Perl library interface to UIMA. SWIG version 1.3.29 or later is required.
The UIMA C++ framework is required.
The Perl development package is also required. On some Unix platforms the development kit is not included with the standard interpreter package. A Unix command line test for the development kit is:
perl -MExtUtils::Embed -e ccopts
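The prerequisite checks above can be combined into a small shell sketch. It assumes a POSIX shell with GNU sort (for the -V version sort) and assumes that swig -version prints a line of the form "SWIG Version x.y.z". The names version_ge and check_prereqs are hypothetical helpers, not part of the Perltator distribution.

```shell
#!/bin/sh
# Sketch of a prerequisite check for building the Perltator.
# version_ge and check_prereqs are hypothetical helper names.

# Succeed if dotted version $1 is greater than or equal to $2.
# Relies on `sort -V` (version sort, GNU coreutils).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

check_prereqs() {
    # The development kit test from the text: it succeeds only if
    # perl is installed and ExtUtils::Embed can report compiler options.
    if perl -MExtUtils::Embed -e ccopts >/dev/null 2>&1; then
        echo "Perl development kit: found"
    else
        echo "Perl development kit: missing"
    fi

    # Assumption: the version appears on a line like "SWIG Version 1.3.29".
    swig_version=$(swig -version 2>/dev/null |
        sed -n 's/^SWIG Version \([0-9.]*\).*/\1/p')
    if [ -n "$swig_version" ] && version_ge "$swig_version" 1.3.29; then
        echo "SWIG $swig_version: ok"
    else
        echo "SWIG 1.3.29 or later: not found"
    fi
}

check_prereqs
```

The script only reports what it finds and does not abort, so it can be run before attempting the build on a new machine.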
Perltator code is distributed in source form and must be built on the target platform.
Perltator source and sample code are located in the $UIMACPP_HOME/scriptators directory.
The Perltator requires the standard environment for UIMA C++ components. In addition, the PERLLIB environment variable should point to the directory containing the perltator.pm file.
On Linux, check that the required Perl and SWIG packages are installed, then run:
make
On Windows, UIMA C++ uses Apache APR as the platform portability library. There is an incompatibility between the APR and ActivePerl typedefs, which must be resolved by editing the ActivePerl header win32.h: change uid_t and gid_t from "long" to "int".
Modify winmake.cmd to set the paths for your Perl and SWIG installations, then run:
winmake
The build results are the C++ annotator, perltator.so on Linux or perltator.dll on Windows, and the Perl module interface to the UIMA APIs, perltator.pm.
If you have write access to the UIMA C++ distribution tree, on Linux copy perltator.pm and perltator.so to $UIMACPP_HOME/lib and add this directory to PERLLIB. On Windows copy perltator.dll and perltator.pm to $UIMACPP_HOME/bin and add this directory to PERLLIB.
If you don't have write access, make sure that perltator.so is in LD_LIBRARY_PATH (Linux) or perltator.dll is in PATH (Windows), and that perltator.pm is in PERLLIB.
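For the case without write access on Linux, the environment setup might look like the following sketch. PERLTATOR_DIR is a hypothetical variable naming the directory that holds the build results; it is not defined by UIMA C++ and defaults here to the current directory.

```shell
#!/bin/sh
# Sketch: make a privately built Perltator visible on Linux without
# copying it into the UIMA C++ distribution tree.
# PERLTATOR_DIR is a hypothetical variable; it defaults to the current directory.
PERLTATOR_DIR=${PERLTATOR_DIR:-$PWD}

# Prepend the directory, keeping any existing entries after it.
export PERLLIB="$PERLTATOR_DIR${PERLLIB:+:$PERLLIB}"
export LD_LIBRARY_PATH="$PERLTATOR_DIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Warn, without failing, if the build results are not where expected.
for f in perltator.pm perltator.so; do
    [ -f "$PERLTATOR_DIR/$f" ] || echo "warning: $f not found in $PERLTATOR_DIR" >&2
done
```

Source the script into the current shell rather than executing it, so that the exported variables persist. Note that Perl ignores PERLLIB when PERL5LIB is set, so in that case the directory would need to be added to PERL5LIB instead.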
A sample Perl annotator, sample.pl, with descriptor PerlSample.xml, is included in the distribution. Perl annotators must be located in a directory listed in the PATH environment variable, and on Linux the .pl files must be executable. Use the descriptor as you would any other UIMA annotator descriptor.
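The two requirements above (the annotator is on PATH, and on Linux it is executable) can be checked before running the descriptor. The following is a minimal sketch; find_annotator is a hypothetical helper, not part of the distribution.

```shell
#!/bin/sh
# Sketch: check that a Perl annotator can be found on PATH and is executable.
# find_annotator is a hypothetical helper name.

# Print the first PATH entry holding an executable regular file named $1.
# Returns non-zero if there is none.
find_annotator() {
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        # An empty PATH entry means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$1" ] && [ -x "$dir/$1" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$1"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

if found=$(find_annotator sample.pl); then
    echo "sample.pl found at $found"
else
    echo "sample.pl is not an executable file on PATH"
    echo "try: chmod +x sample.pl, and add its directory to PATH"
fi
```

A file that is present on PATH but lacks the execute bit is reported as not found, which matches the Linux requirement stated above.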