Apache Accumulo Documentation: Isolation

Scanning

Accumulo can present an isolated view of rows when scanning. A row in Accumulo can change in three ways: a mutation is applied to the table, iterators run as part of a minor or major compaction, or new files are bulk imported.

Isolation guarantees that either all or none of the changes made by these operations on a row are seen. Use the IsolatedScanner to obtain an isolated view of an Accumulo table. A regular scanner may return a non-isolated view of a row. For example, if a mutation modifies three columns, a regular scanner may see only two of those modifications. With the isolated scanner, either all three changes are seen or none are. To see this in practice, try running the InterferenceTest example.
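The difference between the two views can be illustrated with a small, self-contained model. This is only a sketch: it does not use the Accumulo API, and the class and method names (IsolationSketch, nonIsolatedRead, isolatedRead) are hypothetical. It assumes a row can be modeled as a sorted map of columns, and it simulates a three-column mutation arriving after the reader has already returned the first column.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class IsolationSketch {
    /** A single row modeled as a sorted map of column -> value (an assumption of this sketch). */
    static Map<String, String> newRow() {
        Map<String, String> row = new TreeMap<>();
        row.put("col1", "old");
        row.put("col2", "old");
        row.put("col3", "old");
        return row;
    }

    /** The three-column mutation from the text: every column is rewritten. */
    static void applyMutation(Map<String, String> row) {
        row.replaceAll((column, value) -> "new");
    }

    /** Reads columns one at a time from the live row; the mutation lands after the first read. */
    static List<String> nonIsolatedRead(Map<String, String> row) {
        List<String> seen = new ArrayList<>();
        boolean mutated = false;
        for (String column : List.of("col1", "col2", "col3")) {
            seen.add(row.get(column));
            if (!mutated) {
                applyMutation(row); // simulated concurrent write arriving mid-row
                mutated = true;
            }
        }
        return seen;
    }

    /** Copies the whole row before returning anything, so the view is all-old or all-new. */
    static List<String> isolatedRead(Map<String, String> row) {
        Map<String, String> snapshot = new TreeMap<>(row); // buffer the complete row
        applyMutation(row); // the same write no longer affects what the reader returns
        return new ArrayList<>(snapshot.values());
    }

    public static void main(String[] args) {
        System.out.println(nonIsolatedRead(newRow())); // [old, new, new]
        System.out.println(isolatedRead(newRow()));    // [old, old, old]
    }
}
```

In the non-isolated case the reader sees two of the three modifications, which is the situation described above. The isolated reader sees none of them; had the mutation arrived before the snapshot, it would have seen all three.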

At this time there is no client-side isolation support for the BatchScanner. You may consider using the WholeRowIterator with the BatchScanner to achieve isolation instead. The drawback of doing this is that entire rows are read into memory on the server side. If a row is too big, it may crash a tablet server. The IsolatedScanner buffers rows on the client side, so a large row will not crash a tablet server.
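Row buffering itself is simple; what matters is where the memory cost lands. The following is a minimal sketch of buffering on the reading side, assuming entries arrive sorted by row. It is not the IsolatedScanner implementation and does not use the Accumulo API; Entry and RowIterator are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class RowBufferSketch {
    /** Hypothetical stand-in for a key/value pair: row id, column, value. */
    static final class Entry {
        final String row;
        final String column;
        final String value;

        Entry(String row, String column, String value) {
            this.row = row;
            this.column = column;
            this.value = value;
        }
    }

    /** Turns a row-sorted stream of entries into a stream of complete rows. */
    static final class RowIterator implements Iterator<List<Entry>> {
        private final Iterator<Entry> source;
        private Entry pending; // first entry of the next row, read ahead

        RowIterator(Iterator<Entry> source) {
            this.source = source;
            this.pending = source.hasNext() ? source.next() : null;
        }

        @Override
        public boolean hasNext() {
            return pending != null;
        }

        @Override
        public List<Entry> next() {
            if (pending == null) {
                throw new NoSuchElementException();
            }
            // The whole row is held in memory here, so memory use grows with row size.
            List<Entry> row = new ArrayList<>();
            String rowId = pending.row;
            while (pending != null && pending.row.equals(rowId)) {
                row.add(pending);
                pending = source.hasNext() ? source.next() : null;
            }
            return row;
        }
    }

    public static void main(String[] args) {
        List<Entry> sorted = List.of(
            new Entry("r1", "a", "1"),
            new Entry("r1", "b", "2"),
            new Entry("r2", "a", "3"));
        RowIterator rows = new RowIterator(sorted.iterator());
        while (rows.hasNext()) {
            List<Entry> row = rows.next();
            System.out.println(row.get(0).row + " has " + row.size() + " entries");
        }
    }
}
```

Whichever process runs this kind of loop pays for the largest row. With the WholeRowIterator that process is the tablet server; with the IsolatedScanner it is the client.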

Iterators

When writing server-side iterators for Accumulo, isolation is something to be aware of. A scan-time iterator in Accumulo reads from a set of data sources. While an iterator is reading data, it has an isolated view. However, after it returns a key/value pair, Accumulo may switch data sources and re-seek the iterator. This is done so that resources may be reclaimed. When the user does not request isolation, this can occur after any key is returned. When a user requests isolation, this will only occur after a new row is returned, in which case Accumulo re-seeks to the very beginning of the next possible row.
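The two re-seek positions can be made concrete with a small model. This is a sketch under stated assumptions, not Accumulo code: keys are reduced to a (row, column) pair ordered lexicographically, the data is a sorted set, and the names ReseekSketch, Key, resumeNonIsolated and resumeIsolated are hypothetical. It also assumes that the "next possible row" after a row is that row's id with a single zero character appended, which is the smallest string that sorts after it.

```java
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

public class ReseekSketch {
    /** Simplified key: row and column only, ordered by row and then by column. */
    static final class Key implements Comparable<Key> {
        final String row;
        final String column;

        Key(String row, String column) {
            this.row = row;
            this.column = column;
        }

        @Override
        public int compareTo(Key other) {
            int byRow = row.compareTo(other.row);
            return byRow != 0 ? byRow : column.compareTo(other.column);
        }

        @Override
        public String toString() {
            return row + ":" + column;
        }
    }

    /** No isolation: resume at the first key strictly after the last key returned. */
    static Key resumeNonIsolated(NavigableSet<Key> data, Key lastReturned) {
        return data.higher(lastReturned);
    }

    /**
     * Isolation: call only once a row has been fully returned. Resumes at the
     * smallest possible key of the next possible row (row id + "\0", empty column).
     */
    static Key resumeIsolated(NavigableSet<Key> data, Key lastOfRow) {
        Key startOfNextRow = new Key(lastOfRow.row + "\0", "");
        return data.ceiling(startOfNextRow);
    }

    public static void main(String[] args) {
        NavigableSet<Key> data = new TreeSet<>(List.of(
            new Key("r1", "a"), new Key("r1", "b"), new Key("r2", "a")));

        // Re-seek in the middle of row r1 continues inside the same row.
        System.out.println(resumeNonIsolated(data, new Key("r1", "a"))); // r1:b

        // Re-seek after row r1 is complete starts at the beginning of the next row.
        System.out.println(resumeIsolated(data, new Key("r1", "b")));    // r2:a
    }
}
```

Both methods return null when no later key exists. In the non-isolated case the data seen for the remainder of row r1 may come from a different set of sources than the part already returned, which is how a partial view of a mutation can arise.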