Apache Accumulo Configuration Management

All accumulo properties have a default value in the source code. Properties can also be set in accumulo-site.xml and in zookeeper on a per-table or system-wide basis. If a property is set in more than one location, accumulo uses the value from the location with the highest precedence. The order of precedence is described below (from highest to lowest):

table properties
Table properties are applied to the entire cluster when set in zookeeper using the accumulo API or shell. While table properties take precedence over system properties, both will override properties set in accumulo-site.xml.

Table properties consist of all properties with the table.* prefix. Table properties are configured on a per-table basis using the following shell command: config -t TABLE -s PROPERTY=VALUE

system properties
System properties are applied to the entire cluster when set in zookeeper using the accumulo API or shell. System properties consist of all properties with a 'yes' in the 'Zookeeper Mutable' column in the table below. They are set with the following shell command: config -s PROPERTY=VALUE

If a table.* property is set using this method, the value will apply to all tables except those configured on a per-table basis (which have higher precedence).

While most system properties take effect immediately, some require a restart of the process, as indicated in the 'Zookeeper Mutable' column.

accumulo-site.xml
Accumulo processes (master, tserver, etc.) read their local accumulo-site.xml on start up. Therefore, changes made to accumulo-site.xml must be rsynced across the cluster and processes must be restarted to apply changes.

Certain properties (indicated by a 'no' in 'Zookeeper Mutable') cannot be set in zookeeper and can only be set in this file. The accumulo-site.xml file also allows you to configure tablet servers with different settings.

Default
All properties have a default value in the source code. This value has the lowest precedence and is overridden if set in accumulo-site.xml or zookeeper.

While the default value is usually optimal, there are cases where a change can increase query and ingest performance.

The 'config' command in the shell allows you to view the current system configuration. You can also use the '-t' option to view a table's configuration as below:

    $ ./bin/accumulo shell -u root
    Enter current password for 'root'@'ac13': ******

    Shell - Apache Accumulo Interactive Shell
    - version: 1.3.6
    - instance name: ac13
    - instance id: 4f48fa03-f692-43ce-ae03-94c9ea8b7181
    - type 'help' for a list of available commands
    root@ac13> config -t foo
    SCOPE    | NAME                                        | VALUE
    default  | table.balancer ............................ | org.apache.accumulo.server.master.balancer.DefaultLoadBalancer
    default  | table.bloom.enabled ....................... | false
    default  | table.bloom.error.rate .................... | 0.5%
    default  | table.bloom.hash.type ..................... | murmur
    default  | table.bloom.key.functor ................... | org.apache.accumulo.core.file.keyfunctor.RowFunctor
    default  | table.bloom.load.threshold ................ | 1
    default  | table.bloom.size .......................... | 1048576
    default  | table.cache.block.enable .................. | false
    default  | table.cache.index.enable .................. | false
    default  | table.compaction.major.everything.at ...... | 19700101000000GMT
    default  | table.compaction.major.everything.idle .... | 1h
    default  | table.compaction.major.ratio .............. | 1.3
    site     |    @override .............................. | 1.4
    system   |    @override .............................. | 1.5
    table    |    @override .............................. | 1.6
    default  | table.compaction.minor.idle ............... | 5m
    default  | table.compaction.minor.logs.threshold ..... | 3
    default  | table.failures.ignore ..................... | false
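
The listing above shows table.compaction.major.ratio set at all four levels, so it makes a convenient illustration of the precedence rule. The following Python sketch is only a model of that rule: the `layers` dictionary layout and the `resolve` helper are hypothetical names introduced here, not part of accumulo.

```python
# Scopes ordered from highest to lowest precedence, as described above.
PRECEDENCE = ("table", "system", "site", "default")


def resolve(name, layers):
    """Return (scope, value) from the highest-precedence scope that sets `name`.

    `layers` maps a scope name to a dict of property values. This layout is
    an illustrative assumption, not accumulo's internal representation.
    """
    for scope in PRECEDENCE:
        values = layers.get(scope, {})
        if name in values:
            return scope, values[name]
    raise KeyError(name)


# Values copied from the 'config -t foo' listing above.
layers = {
    "default": {"table.compaction.major.ratio": "1.3",
                "table.bloom.enabled": "false"},
    "site": {"table.compaction.major.ratio": "1.4"},
    "system": {"table.compaction.major.ratio": "1.5"},
    "table": {"table.compaction.major.ratio": "1.6"},
}

print(resolve("table.compaction.major.ratio", layers))  # ('table', '1.6')
print(resolve("table.bloom.enabled", layers))           # ('default', 'false')
```

Removing the "table" entry from `layers` would make the first lookup fall through to the system value, which mirrors how a table without its own override inherits the system-wide setting.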

Configuration Properties

Properties in this category (instance.*) must be consistent throughout a cloud. This is enforced and servers won't be able to communicate if these differ.
Property | Type | Zookeeper Mutable | Default Value | Description
instance.dfs.dir absolute path no
HDFS directory in which accumulo instance will run. Do not change after accumulo is initialized.
instance.secret string no
A secret unique to a given instance that all servers must know in order to communicate with one another. Do not change after accumulo is initialized.
instance.zookeeper.host host list no
Comma separated list of zookeeper servers
instance.zookeeper.timeout duration no
Zookeeper session timeout; max value when represented as milliseconds should be no larger than 2147483647
Properties in this category (general.*) affect the behavior of accumulo overall, but do not have to be consistent throughout a cloud.
Property | Type | Zookeeper Mutable | Default Value | Description
general.classpaths string no
A list of all of the places to look for a class. Order does matter, as it will look for the jar starting in the first location to the last. Please note, hadoop conf and hadoop lib directories NEED to be here, along with accumulo lib and zookeeper directory. Supports full regex on filename alone.
general.dynamic.classpaths string no
A list of all of the places where changes in jars or classes will force a reload of the classloader.
general.rpc.timeout duration no
Time to wait on I/O for simple, short RPC calls
Properties in this category (master.*) affect the behavior of the master server.
Property | Type | Zookeeper Mutable | Default Value | Description
master.logger.balancer java class yes
The balancer class that accumulo will use to make logger assignment decisions.
master.port.client port yes but requires restart
The port used for handling client connections on the master
master.recovery.max.age duration yes
Recovery files older than this age will be removed.
master.recovery.pool string yes
Priority queue to use for log recovery map/reduce jobs.
master.recovery.queue string yes
Priority queue to use for log recovery map/reduce jobs.
master.recovery.reducers count yes
Number of reducers to use to sort recovery logs (per log)
master.recovery.sort.mapreduce boolean yes
If true, use map/reduce to sort write-ahead logs during recovery
master.recovery.time.max duration yes
The maximum time to attempt recovery before giving up
master.tablet.balancer java class yes
The balancer class that accumulo will use to make tablet assignment and migration decisions.
Properties in this category (tserver.*) affect the behavior of the tablet servers.
Property | Type | Zookeeper Mutable | Default Value | Description
tserver.bloom.load.concurrent.max count yes
The number of concurrent threads that will load bloom filters in the background. Setting this to zero will make bloom filters load in the foreground.
tserver.cache.data.size memory yes
Specifies the size of the cache for file data blocks.
tserver.cache.index.size memory yes
Specifies the size of the cache for file indices.
tserver.client.timeout duration yes
Time to wait for clients to continue scans before closing a session.
tserver.compaction.major.concurrent.max count yes but requires restart
The maximum number of concurrent major compactions for a tablet server
tserver.compaction.major.delay duration yes
Time a tablet server will sleep between checking which tablets need compaction.
tserver.compaction.major.files.open.max count yes but requires restart
Max number of files a major compaction can open at once. At runtime this number is divided by the concurrent number of compactors.
tserver.compaction.minor.concurrent.max count yes
The maximum number of concurrent minor compactions for a tablet server
tserver.default.blocksize memory yes
Specifies a default blocksize for the tserver caches
tserver.dir.memdump path yes
A long running scan could possibly hold memory that has been minor compacted. To prevent this, the in memory map is dumped to a local file and the scan is switched to that local file. We can not switch to the minor compacted file because it may have been modified by iterators. The file dumped to the local dir is an exact copy of what was in memory.
tserver.files.open.idle duration yes
Tablet servers leave previously used map files open for future queries. This setting determines how much time an unused map file should be kept open until it is closed.
tserver.files.open.max count yes but requires restart
Maximum total map files that all tablets in a tablet server can open. This includes major compactions. So the number of map files that can be opened for searches is: tserver.files.open.max - tserver.compaction.major.files.open.max
tserver.logger.count count yes but requires restart
The number of loggers that each tablet server should use.
tserver.logger.strategy string yes
The classname used to decide which loggers to use.
tserver.logger.timeout duration yes
The time to wait for a logger to respond to a write-ahead request
tserver.memory.lock boolean yes
The tablet server must communicate with zookeeper frequently to maintain its locks. If the tablet server's memory is swapped out the java garbage collector can stop all processing for long periods. Change this property to true and the tablet server will attempt to lock all of its memory to RAM, which may reduce delays during java garbage collection. You will have to modify the system limit for "max locked memory". This feature is only available when running on Linux. Alternatively you may also want to set /proc/sys/vm/swappiness to zero (again, this is Linux-specific).
tserver.memory.manager java class yes
An implementation of MemoryManager that accumulo will use.
tserver.memory.maps.max memory yes
Maximum amount of memory all tablets in memory maps can use.
tserver.memory.maps.native.enabled boolean yes but requires restart
An in-memory data store for accumulo implemented in C++ that increases the amount of data accumulo can hold in memory and avoids Java GC pauses.
tserver.metadata.readahead.concurrent.max count yes
The maximum number of concurrent metadata read ahead that will execute.
tserver.migrations.concurrent.max count yes
The maximum number of concurrent tablet migrations for a tablet server
tserver.monitor.fs boolean yes
When enabled, the tserver will monitor file systems and kill itself when one switches from rw to ro. This is usually an indication that Linux has detected a bad disk.
tserver.mutation.queue.max memory yes
The amount of memory to use to store write-ahead-log mutations-per-session before flushing them.
tserver.port.client port yes but requires restart
The port used for handling client connections on the tablet servers
tserver.port.search boolean yes
If the ports above are in use, search higher ports until one is available.
tserver.readahead.concurrent.max count yes
The maximum number of concurrent read ahead that will execute. This effectively limits the number of long running scans that can run concurrently per tserver.
tserver.session.idle.max duration yes
Maximum idle time for a session.
tserver.tablet.split.midpoint.files.max count yes
To find a tablet's split points, all index files are opened. This setting determines how many index files can be opened at once. When there are more index files than this setting, multiple passes must be made, which is slower. However, opening too many files at once can cause problems.
tserver.walog.max.size memory yes
The maximum size for each write-ahead log
Properties in this category (logger.*) affect the behavior of the write-ahead logger servers.
Property | Type | Zookeeper Mutable | Default Value | Description
logger.archive boolean yes
Determines if write-ahead logs are archived in HDFS.
logger.copy.threadpool.size count yes
Size of the thread pool used to copy files from the local log area to HDFS.
logger.dir.walog path yes
The directory used to store write-ahead logs on the local filesystem
logger.monitor.fs boolean yes
When enabled, the logger will monitor file systems and kill itself when one switches from rw to ro. This is usually an indication that Linux has detected a bad disk.
logger.port.client port yes but requires restart
The port used for write-ahead logger services
logger.port.search boolean yes
If the port above is in use, search higher ports until one is available.
logger.sort.buffer.size memory yes
The amount of memory to use when sorting logs during recovery. Only used when *not* sorting logs with map/reduce.
Properties in this category (gc.*) affect the behavior of the accumulo garbage collector.
Property | Type | Zookeeper Mutable | Default Value | Description
gc.cycle.delay duration yes
Time between garbage collection cycles. In each cycle, old files no longer in use are removed from the filesystem.
gc.cycle.start duration yes
Time to wait before attempting to garbage collect any old files.
gc.port.client port yes but requires restart
The listening port for the garbage collector's monitor service
gc.threads.delete count yes
The number of threads used to delete files
Properties in this category (monitor.*) affect the behavior of the monitor web server.
Property | Type | Zookeeper Mutable | Default Value | Description
monitor.port.client port no
The listening port for the monitor's http service
monitor.port.log4j port no
The listening port for the monitor's log4j logging collection.
Properties in this category (trace.*) affect the behavior of distributed tracing.
Property | Type | Zookeeper Mutable | Default Value | Description
trace.password string no
The password for the user used to store distributed traces
trace.port.client port no
The listening port for the trace server
trace.table string no
The name of the table to store distributed traces
trace.user string no
The name of the user to store distributed traces
Properties in this category (table.*) affect tablet server treatment of tablets, but can be configured on a per-table basis. Setting these properties in the site file overrides the default globally for all tables rather than for any specific table. However, both the default and the global setting can be overridden per table using the table operations API or the shell, which sets the overridden value in zookeeper. Restarting accumulo tablet servers after setting these properties in the site file will cause the global setting to take effect. However, you must use the API or the shell to change properties in zookeeper that are set on a table.
Property | Type | Zookeeper Mutable | Default Value | Description
table.balancer string yes
Set this property to let the LoadBalanceByTable load balancer use a different load balancer for this table.
table.bloom.enabled boolean yes
Use bloom filters on this table.
table.bloom.error.rate fraction/percentage yes
Bloom filter error rate.
table.bloom.hash.type string yes
The bloom filter hash type
table.bloom.key.functor java class yes
A function that can transform the key prior to insertion into and checking of the bloom filter. org.apache.accumulo.core.file.keyfunctor.RowFunctor, org.apache.accumulo.core.file.keyfunctor.ColumnFamilyFunctor, and org.apache.accumulo.core.file.keyfunctor.ColumnQualifierFunctor are allowable values. One can extend any of the above mentioned classes to perform specialized parsing of the key.
table.bloom.load.threshold count yes
This number of seeks that would actually use a bloom filter must occur before a map file's bloom filter is loaded. Set this to zero to initiate loading of bloom filters when a map file is opened.
table.bloom.size count yes
Bloom filter size, as number of keys.
table.cache.block.enable boolean yes
Determines whether file block cache is enabled.
table.cache.index.enable boolean yes
Determines whether index cache is enabled.
table.compaction.major.everything.at date/time yes
This setting specifies a time at which all tablets in a table will major compact to one file, even tablets with only one file. When this setting specifies a time in the future, no action is taken. When the time is in the past, any tablet having a map file older than the specified time will major compact to one file. The time specified must conform to the yyyyMMddHHmmssz pattern. See the Java SimpleDateFormat Javadoc for details about this pattern.
table.compaction.major.everything.idle duration yes
After a tablet has been idle (no mutations) for this time period it may have all of its map files compacted into one. There is no guarantee an idle tablet will be compacted. Compactions of idle tablets are only started when regular compactions are not running. Idle compactions only take place for tablets that have one or more map files.
table.compaction.major.ratio fraction/percentage yes
Minimum ratio of total input size to maximum input file size for running a major compaction.
table.compaction.minor.idle duration yes
After a tablet has been idle (no mutations) for this time period it may have its in-memory map flushed to disk in a minor compaction. There is no guarantee an idle tablet will be compacted.
table.compaction.minor.logs.threshold count yes
When there are more than this many write-ahead logs against a tablet, it will be minor compacted.
table.failures.ignore boolean yes
If you want queries for your table to hang or fail when data is missing from the system, then set this to false. When this is set to true, missing data will be reported but queries will still run, possibly returning a subset of the data.
table.file.blocksize memory yes
Overrides the hadoop dfs.block.size setting so that map files have better query performance. The maximum value for this is 2147483647
table.file.compress.blocksize memory yes
Overrides the hadoop io.seqfile.compress.blocksize setting so that map files have better query performance. The maximum value for this is 2147483647
table.file.compress.type string yes
One of gz, lzo, or none.
table.file.replication count yes
Determines how many replicas to keep of a table's map files in HDFS. When this value is less than or equal to 0, HDFS defaults are used.
table.file.type string yes
Change the type of file a table writes
table.groups.enabled string yes
A comma separated list of locality group names to enable for this table.
table.scan.cache.enable boolean yes
Determines whether scan cache is enabled.
table.scan.cache.size memory yes
Scan cache size.
table.scan.max.memory memory yes
The maximum amount of memory that will be used to cache results of a client query/scan. Once this limit is reached, the buffered data is sent to the client.
table.security.scan.visibility.default string yes
The security label that will be assumed at scan time if an entry does not have a visibility set.
Note: An empty security label is displayed as []. The scan results will show an empty visibility even if the visibility from this setting is applied to the entry.
CAUTION: If a particular key has an empty security label AND its table's default visibility is also empty, access will ALWAYS be granted for users with permission to that table. Additionally, if this field is changed, all existing data with an empty visibility label will be interpreted with the new label on the next scan.
table.split.threshold memory yes
When the combined size of a tablet's map files exceeds this amount, the tablet is split.
table.walog.enabled boolean yes
Use the write-ahead log to prevent the loss of data.
Properties in this category (table.constraint.*) are per-table properties that add constraints to a table. These properties start with the category prefix, followed by a number, and their values correspond to a fully qualified Java class that implements the Constraint interface.
For example, table.constraint.1 = org.apache.accumulo.core.constraints.MyCustomConstraint and table.constraint.2 = my.package.constraints.MySecondConstraint
Properties in this category (table.iterator.*) specify iterators that are applied at various stages (scopes) of interaction with a table. These properties start with the category prefix, followed by a scope (minc, majc, scan, etc.), followed by a period, followed by a name, as in table.iterator.scan.vers, or table.iterator.scan.custom. The value of each property is a number indicating the order in which the iterator is applied, followed by a comma and a class name, such as table.iterator.scan.vers = 10,org.apache.accumulo.core.iterators.VersioningIterator
These iterators can take options if additional properties are set that look like this property, but are suffixed with a period, followed by 'opt' followed by another period, and a property name.
For example, table.iterator.minc.vers.opt.maxVersions = 3
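
To make the naming scheme concrete, here is a small Python sketch that groups iterator properties by scope and name and splits each value into its ordering number and class. It is a model of the property layout described above, not accumulo code: `parse_iterators` is a hypothetical helper, and the table.iterator.minc.vers entry in the sample data is an assumed setting added so the option has an iterator to attach to.

```python
PREFIX = "table.iterator."


def parse_iterators(props):
    """Group table.iterator.* properties into {(scope, name): settings}."""
    iterators = {}

    def entry(scope, name):
        return iterators.setdefault(
            (scope, name), {"priority": None, "class": None, "options": {}}
        )

    for key, value in props.items():
        if not key.startswith(PREFIX):
            continue  # not an iterator property
        parts = key[len(PREFIX):].split(".", 3)
        if len(parts) == 2:
            # table.iterator.<scope>.<name> = <number>,<class>
            priority, class_name = value.split(",", 1)
            settings = entry(parts[0], parts[1])
            settings["priority"] = int(priority.strip())
            settings["class"] = class_name.strip()
        elif len(parts) == 4 and parts[2] == "opt":
            # table.iterator.<scope>.<name>.opt.<option> = <value>
            entry(parts[0], parts[1])["options"][parts[3]] = value
    return iterators


VERS = "org.apache.accumulo.core.iterators.VersioningIterator"
props = {
    "table.iterator.scan.vers": "10," + VERS,
    "table.iterator.minc.vers": "10," + VERS,  # assumed for illustration
    "table.iterator.minc.vers.opt.maxVersions": "3",
    "table.split.threshold": "1G",  # ignored: not an iterator property
}

for (scope, name), settings in sorted(parse_iterators(props).items()):
    print(scope, name, settings["priority"], settings["options"])
```

Option values are kept as strings because their interpretation is left to the iterator class.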
Properties in this category (table.group.*) are per-table properties that define locality groups in a table. These properties start with the category prefix, followed by a name, followed by a period, and followed by a property for that group.
For example, table.group.group1=x,y,z sets the column families for a group called group1. Once configured, group1 can be enabled by adding it to the list of groups in the table.groups.enabled property.
Additional group options may be specified for a named group by setting table.group.<name>.opt.<key>=<value>.
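
As a rough illustration of how the two locality group settings fit together, the Python sketch below builds the property values for a set of groups. The `locality_group_properties` helper is hypothetical and only formats strings; applying them to a table would still be done through the shell or the API.

```python
def locality_group_properties(groups):
    """Build table.group.* and table.groups.enabled values.

    `groups` maps a group name to its list of column families.
    """
    props = {}
    for name, families in groups.items():
        props[f"table.group.{name}"] = ",".join(families)
    # Enable every group that was defined, in the order given.
    props["table.groups.enabled"] = ",".join(groups)
    return props


props = locality_group_properties({"group1": ["x", "y", "z"]})
for key, value in props.items():
    print(f"{key}={value}")
# table.group.group1=x,y,z
# table.groups.enabled=group1
```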

Property Type Descriptions

Property Type | Description

duration

A non-negative integer optionally followed by a unit of time (whitespace disallowed), as in 30s.
If no unit of time is specified, seconds are assumed. Valid units are 'ms', 's', 'm', 'h', 'd' for milliseconds, seconds, minutes, hours, and days.
Examples of valid durations are '600', '30s', '45m', '30000ms', '3d', and '1h'.
Examples of invalid durations are '1w', '1h30m', '1s 200ms', 'ms', '', and 'a'.
Unless otherwise stated, the max value for the duration represented in milliseconds is 9223372036854775807.


date/time

A date/time string in the format YYYYMMDDhhmmssTTT, where TTT is the 3-character time zone.

memory

A positive integer optionally followed by a unit of memory (whitespace disallowed), as in 2G.
If no unit is specified, bytes are assumed. Valid units are 'B', 'K', 'M', 'G', for bytes, kilobytes, megabytes, and gigabytes.
Examples of valid memories are '1024', '20B', '100K', '1500M', '2G'.
Examples of invalid memories are '1M500K', '1M 2K', '1MB', '1.5G', '1,024K', '', and 'a'.
Unless otherwise stated, the max value for the memory represented in bytes is 9223372036854775807.

host list

A comma-separated list of hostnames or ip addresses, with optional port numbers.
Examples of valid host lists are 'localhost:2000,www.example.com,' and 'localhost'.
Examples of invalid host lists are '', ':1000', and 'localhost:80000'


port

A positive integer in the range 1024-65535, not already in use or specified elsewhere in the configuration.

count

A non-negative integer in the range of 0-2147483647.

fraction/percentage

A floating point number that represents either a fraction or, if suffixed with the '%' character, a percentage.
Examples of valid fractions/percentages are '10', '1000%', '0.05', '5%', '0.2%', '0.0005'.
Examples of invalid fractions/percentages are '', '10 percent', and 'Hulk Hogan'.

path

A string that represents a filesystem path, which can be either relative or absolute to some directory. The filesystem depends on the property.

absolute path

An absolute filesystem path. The filesystem depends on the property. This is the same as path, but enforces that its root is explicitly specified.

java class

A fully qualified java class name representing a class on the classpath.
An example is 'java.lang.String', rather than 'String'


string

An arbitrary string of characters whose format is unspecified and interpreted based on the context of the property to which it applies.

boolean

Has a value of either 'true' or 'false'.
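
The duration, memory, and fraction/percentage formats described above can be checked with a few regular expressions. The Python sketch below is one possible reading of those descriptions rather than accumulo's own validation code: the function names are hypothetical, the 'd' unit is inferred from '3d' appearing among the valid examples, the 1024-based memory multipliers are an assumption, and the maximum-value checks are omitted.

```python
import re

# 'd' (days) is inferred from the '3d' example; the 1024-based memory
# multipliers are an assumption, not stated in the descriptions.
MS_PER_UNIT = {"ms": 1, "s": 1_000, "m": 60_000, "h": 3_600_000, "d": 86_400_000}
BYTES_PER_UNIT = {"B": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

DURATION = re.compile(r"([0-9]+)(ms|s|m|h|d)?")
MEMORY = re.compile(r"([0-9]+)([BKMG])?")
FRACTION = re.compile(r"([0-9]+(?:\.[0-9]+)?)(%)?")


def _groups(pattern, text, kind):
    """Match the whole string or raise ValueError."""
    match = pattern.fullmatch(text)
    if match is None:
        raise ValueError(f"invalid {kind}: {text!r}")
    return match.groups()


def duration_to_ms(text):
    """Convert a duration such as '30s' to milliseconds (default unit: seconds)."""
    number, unit = _groups(DURATION, text, "duration")
    return int(number) * MS_PER_UNIT[unit or "s"]


def memory_to_bytes(text):
    """Convert a memory value such as '2G' to bytes (default unit: bytes)."""
    number, unit = _groups(MEMORY, text, "memory")
    return int(number) * BYTES_PER_UNIT[unit or "B"]


def fraction_to_float(text):
    """Convert '0.05' or '5%' to a float fraction."""
    number, percent = _groups(FRACTION, text, "fraction/percentage")
    value = float(number)
    return value / 100 if percent else value


print(duration_to_ms("45m"))    # 2700000
print(memory_to_bytes("100K"))  # 102400
print(fraction_to_float("5%"))  # 0.05
```

Each function raises ValueError for the invalid examples listed above, such as '1h30m', '1MB', or '10 percent'.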