Metastore Interface

Although Apache BookKeeper provides LedgerManager and Hedwig Metadata Managers so that users can plug in different metadata stores for both BookKeeper and Hedwig, implementing a correct and efficient manager requires detailed knowledge of both projects. The MetaStore interface extracts what these metadata storage interfaces have in common, so that users can focus on adapting the underlying storage itself without having to worry about the detailed logic of BookKeeper and Hedwig.

MetaStore

The MetaStore interface provides users with access to the MetastoreTables used for BookKeeper and Hedwig metadata management. Two kinds of table are defined in a MetaStore: MetastoreTable, which provides basic PUT, GET, REMOVE, and SCAN operations and does not assume any ordering in the underlying storage; and MetastoreScannableTable, which is derived from MetastoreTable but does assume that data is stored in key order in the underlying storage.

  • getName: Return the name of the MetaStore.
  • getVersion: Return current MetaStore plugin version.
  • init: Initialize the MetaStore library with the given configuration and its version.
  • close: Close the MetaStore, freeing all resources, e.g. releasing all open connections and occupied memory.
  • createTable: Create a table instance to access the data stored in it. A table name is given to locate the table. A MetastoreTable object is returned.
  • createScannableTable: Similar to createTable, but returns a MetastoreScannableTable rather than a MetastoreTable object. If the underlying table is not an ordered table, a MetastoreException should be thrown.
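
The operations listed above can be pictured as a small Java interface. The following is a minimal sketch, not the actual BookKeeper API: every type name (SketchMetaStore, SketchTable, HashOnlyMetaStore and so on), the parameter types, and the unchecked exception are assumptions made for illustration. It shows a store backed only by hash maps, which therefore refuses to hand out scannable tables.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, unchecked stand-in for the MetastoreException named above.
class SketchMetastoreException extends RuntimeException {
    SketchMetastoreException(String message) { super(message); }
}

// Stand-ins for the two table kinds; only the table name is modelled here.
interface SketchTable {
    String getName();
}

interface SketchScannableTable extends SketchTable { }

class NamedTable implements SketchTable {
    private final String name;
    NamedTable(String name) { this.name = name; }
    @Override public String getName() { return name; }
}

// The six operations listed above; parameter and return types are assumptions.
interface SketchMetaStore {
    String getName();
    int getVersion();
    void init(Map<String, String> conf, int version);
    void close();
    SketchTable createTable(String name);
    SketchScannableTable createScannableTable(String name);
}

// A store backed by hash maps only, so it cannot promise key order.
class HashOnlyMetaStore implements SketchMetaStore {
    static final int CURRENT_VERSION = 1;
    private final Map<String, SketchTable> tables = new HashMap<>();

    @Override public String getName() { return "hash-only"; }

    @Override public int getVersion() { return CURRENT_VERSION; }

    @Override public void init(Map<String, String> conf, int version) {
        // Refuse to start with a plugin version this code does not understand.
        if (version != CURRENT_VERSION) {
            throw new SketchMetastoreException("unsupported version: " + version);
        }
    }

    @Override public void close() {
        // Release everything the store holds; here that is just the table registry.
        tables.clear();
    }

    @Override public SketchTable createTable(String name) {
        // Repeated requests for the same name return the same handle.
        return tables.computeIfAbsent(name, NamedTable::new);
    }

    @Override public SketchScannableTable createScannableTable(String name) {
        // The backing storage is unordered, so a scannable table cannot be offered.
        throw new SketchMetastoreException("table '" + name + "' is not ordered");
    }
}
```

A storage that does keep keys sorted would return a scannable table from createScannableTable instead of throwing.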

MetaStore Table

MetastoreTable is the basic unit in a MetaStore and is used to handle different types of metadata. For example, one MetastoreTable is used to store metadata for ledgers, while another MetastoreTable is used to store topic persistence info. The interface for a MetastoreTable is quite simple:

  • get: Retrieve an entry by a given key. On success, OK is returned together with the entry's current version in metadata storage. NoKey is returned for a non-existent key. If fields are specified, only the specified fields are returned for the key.
  • put: Put the given value associated with key with the given version. The value is only updated when the given version equals the current version in metadata storage. A new version should be returned when the update succeeds. NoKey is returned for a non-existent key; BadVersion is returned when an update is attempted with a version that does not match the one in the metadata store.
  • remove: Remove the value associated with the given key. The value is only removed when the given version equals its current version in metadata storage. NoKey is returned for a non-existent key; BadVersion is returned when a remove is attempted with a version that does not match.
  • openCursor: Open a cursor to iterate over all the entries of a table. The returned cursor does not need to guarantee any ordering or transactional consistency.
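
The version checks on put and remove amount to a compare-and-set per key. Below is a minimal in-memory sketch of that behaviour, assuming synchronous calls, string values, and numeric versions. The names VersionedTableSketch, MsCode and MsResult are hypothetical, and so are the NEW sentinel and the KEY_EXISTS code: the text above does not say how a key is first created, so the sketch assumes a special version for that purpose. Field selection on get is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Status codes named in the text, plus KEY_EXISTS, which is an assumption of this sketch.
enum MsCode { OK, NO_KEY, BAD_VERSION, KEY_EXISTS }

// Result of one table operation; value and version are meaningful only when code is OK.
final class MsResult {
    final MsCode code;
    final String value;
    final long version;

    MsResult(MsCode code, String value, long version) {
        this.code = code;
        this.value = value;
        this.version = version;
    }

    static MsResult error(MsCode code) {
        return new MsResult(code, null, Long.MIN_VALUE);
    }
}

class VersionedTableSketch {
    // Hypothetical sentinel meaning "create this key; it must not exist yet".
    static final long NEW = -1L;

    private static final class Row {
        String value;
        long version;
        Row(String value, long version) { this.value = value; this.version = version; }
    }

    private final Map<String, Row> rows = new HashMap<>();

    synchronized MsResult get(String key) {
        Row row = rows.get(key);
        if (row == null) return MsResult.error(MsCode.NO_KEY);
        return new MsResult(MsCode.OK, row.value, row.version);
    }

    synchronized MsResult put(String key, String value, long expectedVersion) {
        Row row = rows.get(key);
        if (expectedVersion == NEW) {
            // Creation path (assumed): succeeds only if the key is absent.
            if (row != null) return MsResult.error(MsCode.KEY_EXISTS);
            rows.put(key, new Row(value, 0L));
            return new MsResult(MsCode.OK, value, 0L);
        }
        if (row == null) return MsResult.error(MsCode.NO_KEY);
        // Update only when the caller has seen the latest version.
        if (row.version != expectedVersion) return MsResult.error(MsCode.BAD_VERSION);
        row.value = value;
        row.version++;
        return new MsResult(MsCode.OK, value, row.version);
    }

    synchronized MsResult remove(String key, long expectedVersion) {
        Row row = rows.get(key);
        if (row == null) return MsResult.error(MsCode.NO_KEY);
        if (row.version != expectedVersion) return MsResult.error(MsCode.BAD_VERSION);
        rows.remove(key);
        return new MsResult(MsCode.OK, row.value, row.version);
    }
}
```

A writer that receives BadVersion would typically call get again, reapply its change to the fresh value, and retry put with the version it just read.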

MetaStore Scannable Table

MetastoreScannableTable is identical to a MetastoreTable except that it provides an additional interface to iterate over entries in the table in key order.

  • openCursor: Open a cursor to iterate over the entries of a table whose keys fall in the range between firstKey and lastKey.
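
A range cursor of this kind maps naturally onto a sorted map. The sketch below uses java.util.TreeMap and returns keys in ascending order. It treats both firstKey and lastKey as inclusive, which is an assumption, since the text does not say how the bounds are handled. The class OrderedTableSketch and the zero-padded key encoding in ledgerKey are likewise hypothetical; padding is shown because string keys only sort like numbers when they have equal width.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.TreeMap;

class OrderedTableSketch {
    // TreeMap keeps keys sorted, which is the property a scannable table relies on.
    private final TreeMap<String, String> rows = new TreeMap<>();

    // Hypothetical key encoding: zero-pad ids so string order matches numeric order.
    static String ledgerKey(long ledgerId) {
        return String.format(Locale.ROOT, "%010d", ledgerId);
    }

    void put(String key, String value) {
        rows.put(key, value);
    }

    // Iterate keys in [firstKey, lastKey], ascending. Both bounds inclusive (assumed).
    Iterator<String> openCursor(String firstKey, String lastKey) {
        // Copy the range so later writes do not disturb a cursor that is already open.
        List<String> snapshot =
                new ArrayList<>(rows.subMap(firstKey, true, lastKey, true).keySet());
        return snapshot.iterator();
    }
}
```

Without the padding, the key "10" would sort before "2", and a scan from ledger 2 to ledger 10 would be rejected as an inverted range.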

How to organize your metadata

Some metadata in Hedwig and BookKeeper does not need to be stored in the order of the ledger id or the topic, namely topic ownership and topic persistence info. You could use a hash table of some kind to store them. Subscription state and ledger metadata, on the other hand, must be stored in key order because of the current logic in Hedwig and BookKeeper.
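
As a rough illustration of this split, the sketch below tags each kind of metadata with whether it needs key order and picks a backing structure accordingly. The names MetadataKind and TableLayoutSketch are hypothetical, and in-memory maps stand in for whatever storage is actually being adapted.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// The four kinds of metadata mentioned above, tagged with their ordering requirement.
enum MetadataKind {
    TOPIC_OWNERSHIP(false),
    TOPIC_PERSISTENCE_INFO(false),
    SUBSCRIPTION_STATE(true),
    LEDGER_METADATA(true);

    final boolean needsKeyOrder;

    MetadataKind(boolean needsKeyOrder) {
        this.needsKeyOrder = needsKeyOrder;
    }
}

class TableLayoutSketch {
    // Ordered kinds get a sorted map; the others can live in a plain hash table.
    static Map<String, String> newBackingMap(MetadataKind kind) {
        if (kind.needsKeyOrder) {
            return new TreeMap<String, String>();
        }
        return new HashMap<String, String>();
    }
}
```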