Ignite Native Persistence

As it is covered in memory-centric storage, Ignite is widely used as a caching layer (aka. data grid) above an existing 3rd party database such as RDBMS, Apache Cassandra or MongoDB. This mode is used to accelerate the underlying database that persists the data. At the same time Ignite comes with its own persistence, that is considered as an alternate and preferable persistence layer for the Ignite cluster.

Ignite native persistence is a distributed, ACID, and SQL-compliant disk store that transparently integrates with Ignite's durable memory. Ignite persistence is optional and can be turned on and off. When turned off Ignite becomes a pure in-memory store.

Apache Ignite Native Persistence - Distributed SQL Database

With the the native persistence enabled, Ignite always stores a superset of data on disk, and as much as possible in RAM. For example, if there are 100 entries and RAM has the capacity to store only 20, then all 100 will be stored on disk and only 20 will be cached in RAM for better performance.

The native persistence has the following important characteristics:

SQL queries over the full data set that spans both, memory and disk. This means that Apache Ignite can be used as a memory-centric distributed SQL database.
No need to have all the data in memory. Ignite persistence allows storing a superset of data on disk and only most frequently used subsets in memory.
Instantaneous cluster restarts. Ignite becomes fully operational from disk immediately upon cluster startup or restart. There is no need to preload or warm up the in-memory caches. The data will be loaded in-memory lazily, as it gets accessed.
Data and indexes are stored in a similar format both in memory and on disk, which helps avoid expensive transformations when moving data between memory and disk.
Ability to create full and incremental cluster snapshots by plugging-in 3rd party solutions.

Write-Ahead Log

Every time the data is updated in memory, the update will be appended to the tail of the write-ahead log (WAL). The purpose of the WAL is to propagate updates to disk in the fastest way possible and provide a consistent recovery mechanism that supports full cluster failures.

The whole WAL is split into several files, called segments, that are filled out sequentially. Once a segment is full, its content will be copied to the WAL archive where it will be preserved for a configurable amount of time. While the segment is being copied, another segment will be treated as an active WAL file.

The cluster can always be recovered up to the latest successfully committed transaction.

Checkpointing

As WAL grows, it periodically gets checkpointed to the main storage. The checkpointing is the process of copying dirty pages from memory to the partition files on disk. A dirty page is a page that was updated in memory, was appended to WAL, but was not written to a respective partition file on disk yet.

Durability

Ignite native persistence provides ACID durability guarantees to the data:

Committed transactions will always survive any failures.
The cluster can always be recovered to the latest successfully committed transaction.
The cluster restarts are very fast.

Configuration

To enable Ignite persistence, add the following configuration parameter to the cluster's node configuration: