Apache Accumulo 1.5.4 Release Notes

Apache Accumulo 1.5.4 is one more bug-fix release for the 1.5 series. Like 1.5.3 before it, this release contains a very small changeset when considering the normal size of changes in a release.

This release contains no changes to the public API. As such, there are no concerns for the compatibility of user code running against 1.5.3. All users are encourage to upgrade immediately without concern of stability and compatibility.

A full list of changes is available via CHANGES.

We'd like to thank all of the committers and contributors which had a part in making this release, from code contributions to testing. Everyone's efforts are greatly appreciated.

Correctness Bugs

Silent data-loss via bulk imported files

A user recently reported that a simple bulk-import application would occasionally lose some records. Through investigation, it was found that when bulk imports into a table failed the initial assignment, the logic that automatically retries the imports was incorrectly choosing the tablets to import the files into. ACCUMULO-3967 contains more information on the cause and identification of the bug. The data-loss condition would only affect entire files. If records from a file exist in Accumulo, it is still guaranteed that all records within that imported file were successful.

As such, users who have bulk import applications using previous versions of Accumulo should verify that all of their data was correctly ingested into Accumulo and immediately update to Accumulo 1.5.4.

Thanks to Edward Seidl for reporting this bug to us!

Server-side auditing changes

Thanks to James Mello for reporting and providing the fixes to the following server-side auditing issues.

Incorrect audit initialization

It was observed that the implementation used to audit user API requests on Accumulo server processes was not being correctly initialized which caused audit messages to never be generated. This was rectified in ACCUMULO-3939.

Missing audit implementations

It was also observed that some server-side API implementations did not include audit messages which resulted in an incomplete historical picture on what operations a user might have invoked. The missing audits (and those that were added) are described in ACCUMULO-3946.

Testing

Each unit and functional test only runs on a single node, while the RandomWalk and Continuous Ingest tests run on any number of nodes. Agitation refers to randomly restarting Accumulo processes and Hadoop DataNode processes, and, in HDFS High-Availability instances, forcing NameNode fail-over.

OS Hadoop Nodes ZooKeeper HDFS High-Availability Tests
OSX 2.6.0 1 3.4.5 No Unit and Functional Tests
Centos 6.5 2.7.1 6 3.4.5 No Continuous Ingest and Verify (10B entries), Randomwalk (24hrs)