Apache Accumulo 1.7.1 Release Notes
Apache Accumulo 1.7.1 is a maintenance release on the 1.7 version branch. This release contains changes from more than 150 issues, comprising bug fixes, performance improvements, build quality improvements, and more. See JIRA for a complete list.
Users of any previous 1.7.x release are strongly encouraged to update as soon as possible to benefit from the improvements, with very little concern about changes in underlying functionality. Users of 1.6 or earlier who are seeking to upgrade to 1.7 should consider 1.7.1 as a starting point.
Highlights
Silent data-loss via bulk imported files
A user recently reported that a simple bulk-import application would occasionally lose some records. Investigation showed that when a bulk import into a table failed its initial assignment, the logic that automatically retries the import was incorrectly choosing the tablets to import the files into. ACCUMULO-3967 contains more information on the cause and identification of the bug. The data-loss condition would only affect entire files. If any records from a file exist in Accumulo, all records within that file are guaranteed to have been imported successfully.
As such, users who have bulk import applications using previous versions of Accumulo should verify that all of their data was correctly ingested into Accumulo and immediately update to Accumulo 1.7.1. (This is the same bug that was fixed in 1.6.4, so you won't be affected if you're running 1.6.4 or newer.)
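Because the bug dropped whole files and never part of a file, a verification pass only needs to look for one known record per imported file. The following is a minimal sketch of that idea, not an Accumulo utility: `sample_keys` and `row_exists` are hypothetical names, and the lookup itself (for example, a scan of the table) is left to the caller. It also assumes the sampled rows have not been deleted since the import.

```python
from typing import Callable, Dict, List


def find_missing_files(sample_keys: Dict[str, str],
                       row_exists: Callable[[str], bool]) -> List[str]:
    """Return the bulk-import files whose sampled row is absent from the table.

    sample_keys maps each imported file name to one row ID known to be in it.
    row_exists is a hypothetical lookup supplied by the caller (for example,
    a wrapper around a scan of the table); it is not an Accumulo API.
    """
    missing = []
    for file_name, row in sorted(sample_keys.items()):
        # The bug lost entire files, so one absent row suggests the whole
        # file was never imported and should be loaded again.
        if not row_exists(row):
            missing.append(file_name)
    return missing
```

Any file reported by such a check would be a candidate for re-import after upgrading.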
Queued Compactions Not Running
Found and fixed a bug (ACCUMULO-4016) in which some queued compactions would never run if the number of files changed while the tablet was queued.
Kerberos Ticket Renewals
A bug was fixed which caused Accumulo clients and services to fail to check and (if necessary) renew their Kerberos credentials. This would eventually lead to these components failing to properly authenticate until they were restarted. (ACCUMULO-4069)
Updated commons-collections
The bundled commons-collections library was updated from version 3.2.1 to 3.2.2 because of a reported vulnerability in that library. (ACCUMULO-4056)
Faster Processing of Conditional Mutations
Improved ConditionalMutation processing time by a factor of 3. (ACCUMULO-4066)
Slow GC While Bulk Importing
Found and worked around an issue where a large number of bulk imports creating many new files would significantly impair the Accumulo GC service, possibly preventing it from running to completion. (ACCUMULO-4021)
Unnoticed Per-table Configuration Updates
Fixed a bug that, under some circumstances, caused tablet servers not to notice changes to per-table constraints. (ACCUMULO-3859)
TabletServers kill themselves on CentOS 7
Reduced the aggressiveness with which Accumulo tablet servers preemptively kill themselves when a local filesystem switches to read-only (indicating a possible failure). To reduce false positives, such as those which can occur with systemd's extra cgroup mounts in CentOS 7, an additional check was added so that tablet servers only kill themselves if an ext- or xfs-formatted disk switches to read-only. (ACCUMULO-4080)
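The narrowed check can be pictured as a filter over the mount table. The sketch below assumes the mount information is available in the Linux `/proc/mounts` format (device, mount point, filesystem type, options). The function name `read_only_disk_mounts` is hypothetical, and this is an illustration of the idea rather than Accumulo's actual implementation.

```python
from typing import List


def read_only_disk_mounts(mounts_text: str) -> List[str]:
    """Return mount points of ext* or xfs filesystems mounted read-only.

    mounts_text is assumed to be in /proc/mounts format:
    device, mount point, filesystem type, comma-separated options, ...
    """
    flagged = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        # Only disk-backed filesystems count; tmpfs/cgroup mounts are ignored.
        is_disk_fs = fs_type.startswith("ext") or fs_type == "xfs"
        if is_disk_fs and "ro" in options.split(","):
            flagged.append(mount_point)
    return flagged
```

With this filter, a read-only `tmpfs` mount such as systemd's cgroup hierarchy is ignored, while a read-only `ext4` or `xfs` mount is still reported.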
Improvements in Locating Client Configuration File
Fixed some unexpected error messages related to setting ACCUMULO_CLIENT_CONF_PATH, and improved the detection of the client.conf file if ACCUMULO_CLIENT_CONF_PATH was set to a directory containing client.conf. (ACCUMULO-4026, ACCUMULO-4027)
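The directory case described above amounts to a small path-resolution rule. The sketch below shows one way to express it; `resolve_client_conf` is a hypothetical helper written for illustration and does not reproduce Accumulo's own lookup code.

```python
import os
from typing import Optional


def resolve_client_conf(path: str) -> Optional[str]:
    """Resolve a configured path to a client.conf file, if one exists.

    If path is a directory, look for client.conf inside it; otherwise
    treat path as the file itself. Returns None when no file is found.
    """
    if os.path.isdir(path):
        candidate = os.path.join(path, "client.conf")
    else:
        candidate = path
    return candidate if os.path.isfile(candidate) else None
```

A caller might pass in the value of `os.environ.get("ACCUMULO_CLIENT_CONF_PATH")` when that variable is set.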
Transient ZooKeeper disconnect causes FATE threads to exit
ZooKeeper clients are expected to handle the situation where they become disconnected from the ZooKeeper server and must wait to be reconnected before continuing ZooKeeper operations.
The dedicated threads running inside the Accumulo Master process for FATE actions had the potential to exit unexpectedly in this disconnected state. This caused a scenario where all future FATE-based operations would be blocked until the Accumulo Master process was restarted. (ACCUMULO-4060)
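The expected client behavior, waiting out a disconnect instead of letting the worker die, can be sketched as a retry loop. Everything here is illustrative: `ConnectionLost` is a hypothetical stand-in for a transient disconnect error, and `run_with_reconnect` is not part of Accumulo or the ZooKeeper client.

```python
import time
from typing import Callable


class ConnectionLost(Exception):
    """Hypothetical stand-in for a transient ZooKeeper disconnect."""


def run_with_reconnect(operation: Callable[[], str],
                       max_attempts: int = 5,
                       wait_seconds: float = 0.0) -> str:
    """Retry operation while the connection is down instead of exiting."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConnectionLost:
            if attempt == max_attempts:
                raise  # give up only after the final attempt
            # Wait for the client to reconnect, then try again; the worker
            # stays alive rather than dying on the first disconnect.
            time.sleep(wait_seconds)
    raise ValueError("max_attempts must be at least 1")
```

A long-lived worker would typically wrap each unit of work in such a loop so that a brief disconnect delays the work rather than ending the thread.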
Incorrect management of certain Thrift RPCs
Accumulo relies on Apache Thrift to implement remote procedure calls between Accumulo services. Accumulo's use of Thrift uncovered an unfortunate situation where a special RPC (a "oneway" call) would leave unwanted data on the underlying Thrift connection. After this extra data was left on the connection, all subsequent RPCs re-using that connection would fail with "out of sequence response" error messages. Accumulo would be left in a bad state until the mishandled connections were released or Accumulo services were restarted. (ACCUMULO-4065)
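To see why leftover data poisons a reused connection, consider the toy model below. It is a deliberately simplified simulation, not the Thrift API: `ToyConnection` and its sequence-number bookkeeping are assumptions made for illustration, and the flag `server_replies_to_oneway` stands in for whatever caused the unwanted data to be written.

```python
from collections import deque


class ToyConnection:
    """Toy model of a reused RPC connection; not the Thrift API."""

    def __init__(self, server_replies_to_oneway: bool):
        self.server_replies_to_oneway = server_replies_to_oneway
        self.pending = deque()  # responses written by the server, not yet read
        self.next_seq = 0

    def _send(self, oneway: bool) -> int:
        self.next_seq += 1
        # A well-behaved server writes nothing back for a oneway call.
        if not oneway or self.server_replies_to_oneway:
            self.pending.append(self.next_seq)
        return self.next_seq

    def call_oneway(self) -> None:
        # The client never reads a response for a oneway call.
        self._send(oneway=True)

    def call(self) -> int:
        seq = self._send(oneway=False)
        reply_seq = self.pending.popleft()
        if reply_seq != seq:
            # The stale oneway reply is read in place of the real one.
            raise RuntimeError("out of sequence response")
        return reply_seq
```

In this model, once a stale reply is queued, every later `call()` on the same connection reads a response one step behind its request, which mirrors why the connection stays broken until it is discarded.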
Other Notable Changes
- ACCUMULO-3509 Fixed some lock contention in TabletServer, preventing resource cleanup
- ACCUMULO-3734 Fixed quote-escaping bug in VisibilityConstraint
- ACCUMULO-4025 Fixed cleanup of bulk load fate transactions
- ACCUMULO-4098, ACCUMULO-4113 Fixed widespread misuse of ByteBuffer
Testing
Each unit and functional test only runs on a single node, while the RandomWalk and Continuous Ingest tests run on any number of nodes. Agitation refers to randomly restarting Accumulo processes and Hadoop Datanode processes, and, in HDFS High-Availability instances, forcing NameNode failover.
OS/Environment | Hadoop | Nodes | ZooKeeper | HDFS High-Availability | Tests |
---|---|---|---|---|---|
CentOS 7.1 w/Oracle JDK8 on EC2 (1 m3.xlarge, 8 d2.xlarge) | 2.6.3 | 9 | 3.4.6 | No | Random walk (All.xml) 24-hour run, saw ACCUMULO-3794 and ACCUMULO-4151. |
CentOS 7.1 w/Oracle JDK8 on EC2 (1 m3.xlarge, 8 d2.xlarge) | 2.6.3 | 9 | 3.4.6 | No | 21 hr run of CI w/ agitation, 23.1B entries verified. |
CentOS 7.1 w/Oracle JDK8 on EC2 (1 m3.xlarge, 8 d2.xlarge) | 2.6.3 | 9 | 3.4.6 | No | 24 hr run of CI w/o agitation, 23.0B entries verified; saw performance issues outlined in comment on ACCUMULO-4146. |
CentOS 6.7 (OpenJDK 7), Fedora 23 (OpenJDK 8), and CentOS 7.2 (OpenJDK 7) | 2.6.1 | 1 | 3.4.6 | No | All unit tests and ITs pass with -Dhadoop.version=2.6.1; Kerberos ITs had a problem with earlier versions of Hadoop. |