Bookie InternalsBookie server stores its data in multiple ledger directories and its journal files in a journal directory. Ideally, storing journal files in a separate directory than data files would increase throughput and decrease latency The Bookie JournalJournal directory has one kind of file in it:
Before persisting ledger index and data to disk, a bookie ensures that the transaction that represents the update is written to a journal in non-volatile storage. A new journal file is created using current timestamp when a bookie starts or an old journal file reaches its maximum size. A bookie supports journal rolling to remove old journal files. In order to remove old journal files safely, bookie server records LastLogMark in Ledger Device, which indicates all updates (including index and data) before LastLogMark has been persisted to the Ledger Device. LastLogMark contains two parts:
You may use following settings to further fine tune the behavior of journalling on bookies:
ZooKeeper MetadataFor BookKeeper, we require a ZooKeeper installation to store metadata, and to pass the list of ZooKeeper servers as parameter to the constructor of the BookKeeper class ( BookKeeper provides two mechanisms to organize its metadata in ZooKeeper. By default, the
Flat Ledger ManagerAll ledgers' metadata are put in a single zookeeper path, created using zookeeper sequential node, which can ensure uniqueness of ledger id. Each ledger node is prefixed with 'L'. Bookie server manages its owned active ledgers in a hash map. So it is easy for bookie server to find what ledgers are deleted from zookeeper and garbage collect them. And its garbage collection flow is described as below:
Hierarchical Ledger Manager
Since ZooKeeper sequential counter has a format of %10d -- that is 10 digits with 0 (zero) padding, i.e. "<path>0000000001",
These 3 parts are used to form the actual ledger node path used to store ledger metadata:
E.g. Ledger 0000000001 is split into 3 parts 00, 0000, 00001, which is stored in znode /{ledgers_root_path}/00/0000/L0001. So each znode could have at most 10000 ledgers, which avoids the problem of the child list being larger than the maximum ZooKeeper packet size. Bookie server manages its active ledgers in a sorted map, which simplifies access to active ledgers in a particular (level1, level2) partition. Garbage collection in bookie server is processed node by node as follows:
|