Welcome to Apache BookKeeper™

The Apache BookKeeper subproject of ZooKeeper is made up of a distributed logging service called BookKeeper and a distributed publish/subscribe system built on top of BookKeeper called Hedwig.

What is BookKeeper?

BookKeeper is a replicated log service which can be used to build replicated state machines. A log contains a sequence of entries (events), each of which can be applied to a state machine. BookKeeper guarantees that each replica state machine will see all the same entries, in the same order.
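
As a minimal sketch of why that guarantee matters (plain Python, not the BookKeeper API; the `apply_entry` and `replay` helpers and the entry format are hypothetical), two replicas that apply the same entries in the same order end up in the same state, while a different order does not:

```python
# Hypothetical in-memory illustration; this does not use BookKeeper itself.

def apply_entry(state, entry):
    """Apply one log entry (an operation name and an amount) to a counter."""
    op, amount = entry
    if op == "add":
        return state + amount
    if op == "mul":
        return state * amount
    raise ValueError(f"unknown operation: {op}")


def replay(log, initial=0):
    """Fold every entry of the log, in order, into the initial state."""
    state = initial
    for entry in log:
        state = apply_entry(state, entry)
    return state


log = [("add", 2), ("mul", 5), ("add", 1)]

# Two replicas reading the same log in the same order agree on the result.
replica_a = replay(log)
replica_b = replay(log)

# The same entries applied in a different order give a different state.
reordered = replay([("mul", 5), ("add", 2), ("add", 1)])
```

Here both replicas finish at 11, whereas the reordered run finishes at 3, which is why a replicated log has to fix a single order for all readers.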

Eh? What good is that to me?

Imagine, for example, that you have a database that you want to be able to access even if the database server goes down. You'll need to replicate it to multiple servers. You need to ensure that if one database sees an update, all databases see the update. But what happens if one database server is cut off from the network for a time? Or if two clients try to update the same field at exactly the same instant? This is where a replicated log comes in.

A database can be seen as a state machine. It is the sum of all the updates which it has applied since its initial state. Therefore, if you consider your replicated database as a replicated state machine, you can do the replication using a replicated log service. If all updates are written to the replicated log service before being applied to the database, then the database will continue to be available and consistent even if some of the replicas fail.
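
The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ReplicatedLog` is a hypothetical in-memory stand-in for a replicated log service, `DatabaseReplica` is a hypothetical key/value database, and neither reflects the BookKeeper API.

```python
class ReplicatedLog:
    """Hypothetical stand-in for a replicated log: an append-only list."""

    def __init__(self):
        self._entries = []

    def append(self, entry):
        self._entries.append(entry)
        return len(self._entries) - 1  # position of the new entry

    def read_from(self, position):
        return self._entries[position:]


class DatabaseReplica:
    """Hypothetical key/value database whose state is derived from the log."""

    def __init__(self, log):
        self.log = log
        self.data = {}
        self.next_position = 0  # first log position not yet applied

    def catch_up(self):
        # Apply every entry this replica has not seen yet, in log order.
        for key, value in self.log.read_from(self.next_position):
            self.data[key] = value
            self.next_position += 1

    def update(self, key, value):
        # Write to the log first, then apply from the log.
        self.log.append((key, value))
        self.catch_up()


log = ReplicatedLog()
primary = DatabaseReplica(log)
backup = DatabaseReplica(log)

# The backup is "cut off" while these updates happen. Two of them touch the
# same field; the order in the log decides which one wins.
primary.update("colour", "red")
primary.update("colour", "blue")
primary.update("size", "large")

# Once the backup is reachable again, it replays the entries it missed.
backup.catch_up()
```

After `backup.catch_up()`, both replicas hold the same data, because each one derived its state from the same entries in the same order.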

This approach can be applied to many types of distributed systems, such as messaging systems, coordination systems, filesystems, etc.

What is BookKeeper not?

BookKeeper has nothing to do with application/error/trace logging. There are already many projects dedicated to that problem.

BookKeeper does not provide leader election. You'll need to use something like ZooKeeper for that.

How about Hedwig?

Hedwig is a distributed publish/subscribe system which uses BookKeeper to replicate its messages.

More information

Learn more about BookKeeper on the BookKeeper Wiki.
Learn more about Hedwig on the Hedwig Wiki.

Getting Involved

Apache ZooKeeper is an open source volunteer project under the Apache Software Foundation. It is a subproject of Hadoop. We encourage you to learn about the project and contribute your expertise. Here are some starter links:

  1. See our How to Contribute to ZooKeeper page.
  2. Give us feedback: What can we do better?
  3. Join the mailing list: Meet the community.