Apache UIMA-DUCC (Unstructured Information Management Architecture - Distributed UIMA Cluster Computing) v.2.2.0 Release Notes
Contents
1. What is UIMA-DUCC?
2. Major Changes in this Release
3. Migration from a Prior Release
4. Limitations
1. What is UIMA-DUCC?
DUCC stands for Distributed UIMA Cluster Computing. DUCC is a cluster management system providing tooling,
management, and scheduling facilities to automate the scale-out of applications written to the UIMA framework.
Core UIMA provides a generalized framework for applications that process unstructured information such as human
language, but does not provide a scale-out mechanism. UIMA-AS provides a scale-out mechanism to distribute UIMA
pipelines over a cluster of computing resources, but does not provide job or cluster management of the resources.
DUCC defines a formal job model that closely maps to a standard UIMA pipeline. Around this job model DUCC
provides cluster management services to automate the scale-out of UIMA pipelines over computing clusters.
2. Major Changes in this Release
Apache UIMA DUCC 2.2.0 is a major release containing new features and bug fixes. What's new:
- Ships with the latest UIMA-AS v2.9.0 and UIMA SDK 2.9.0
- Ships with ActiveMQ v5.14.0
- Added support for static failover and the capability to move the head node
- Fixed DUCC Orchestrator (OR) "warm" restart issues
- Fixed DUCC startup script to fail if the DB doesn't start
- Fixed a DUCC shutdown sequence bug which prevented agents from stopping if the broker was shut down first
- Fixed the rogue process detector to detect and clean up orphan services
- Deprecated ducc.agent.node.metrics.sys.gid.max and replaced it with ducc.agent.rogue.process.sys.uid.max
- Enhanced DUCC Job Driver (JD) to provide individual work item performance breakdowns
- Modified DUCC to restrict broker use to ducc user only
- On process launch failure, the agent supplies a reason for the failure for display in duccmon
- Added a duplicate daemon detector to prevent starting a duplicate DUCC daemon
- Many DUCC Database improvements
- Many DUCC webpage improvements
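The property rename listed above is the kind of change that is easy to miss in a customized configuration. As a minimal sketch, assuming the site overrides are kept in a Java-style properties file, the following Python helper reports lines that still set the deprecated key. The function name and the sample value are hypothetical, and the helper only flags the key; it does not assume the old value carries over to the new property.

```python
import re

# Deprecated key and its replacement, as named in these release notes.
DEPRECATED_TO_REPLACEMENT = {
    "ducc.agent.node.metrics.sys.gid.max": "ducc.agent.rogue.process.sys.uid.max",
}


def find_deprecated_properties(properties_text):
    """Return (line_number, deprecated_key, replacement_key) tuples.

    Hypothetical helper: scans properties-style text and reports every
    active (non-comment) line that sets a deprecated key.
    """
    findings = []
    for line_number, raw_line in enumerate(properties_text.splitlines(), start=1):
        line = raw_line.strip()
        # Skip blank lines and comment lines ('#' or '!' in properties files).
        if not line or line.startswith(("#", "!")):
            continue
        # Keys end at the first '=', ':' or whitespace character.
        key = re.split(r"[=:\s]", line, maxsplit=1)[0]
        if key in DEPRECATED_TO_REPLACEMENT:
            findings.append((line_number, key, DEPRECATED_TO_REPLACEMENT[key]))
    return findings


# Illustrative input only; the value 500 is a placeholder, not a DUCC default.
sample = "# site overrides\nducc.agent.node.metrics.sys.gid.max = 500\n"
for line_number, old_key, new_key in find_deprecated_properties(sample):
    print(f"line {line_number}: {old_key} is deprecated, use {new_key}")
```

Working on text rather than a fixed file path keeps the sketch independent of where a particular installation stores its properties.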
For a complete list of issues fixed and up-to-date information on UIMA-DUCC issues, see our issue tracker:
https://issues.apache.org/jira/issues/?jql=project%20%3D%20UIMA%20AND%20fixVersion%20%3D%20%222.2.0-Ducc%22%20
3. Migration from a Prior Release
An existing DUCC installation can be updated in place by using the ducc_update script, which can be
copied from the UIMA Downloads page or extracted from the binary distribution.
Additional steps are required to convert existing history and state files for database access.
Details are in the INSTALL document and the DuccBook.
4. Limitations
On some systems, cgroups swap accounting is not enabled and duccmon will show N/A for swap. To
confirm, check the memory.stat file in the /ducc/ folder. If swap accounting is
enabled, a "swap" property should be defined there. If it is missing, add the kernel parameter
swapaccount=1. Details of how to do this can be found here.
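The check described above can be sketched in Python. This is a minimal illustration, assuming memory.stat uses the usual "name value" layout with one property per line. The function name is hypothetical, and the caller supplies the file contents because the exact cgroup path depends on the system.

```python
def swap_accounting_enabled(memory_stat_text):
    """Return True if the memory.stat contents define a "swap" property.

    Hypothetical helper: expects the text of a cgroup memory.stat file,
    where each line holds a property name followed by its value.
    """
    for line in memory_stat_text.splitlines():
        fields = line.split()
        # Match the property name exactly so names that merely contain
        # "swap" (for example a prefixed total) are not counted.
        if fields and fields[0] == "swap":
            return True
    return False
```

To use it, read the memory.stat file of the DUCC cgroup into a string and pass it to the function. A False result suggests that the swapaccount=1 kernel parameter is still needed.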