BookKeeper

For reliable BookKeeper service, you should deploy BookKeeper in a cluster.

Run from bookkeeper source

The version of BookKeeper that DistributedLog depends on is not the official opensource version. It is twitter's production version 4.3.4-TWTTR, which is available in https://github.com/twitter/bookkeeper. We are working actively with BookKeeper community to merge all twitter's changes back to the community.

The major changes in Twitter's bookkeeper includes:

  • BOOKKEEPER-670: Long poll reads and LastAddConfirmed piggyback. It is to reduce the tailing read latency.
  • BOOKKEEPER-759: Delay ensemble change if it doesn't break ack quorum constraint. It is to reduce the write latency on bookie failures.
  • BOOKKEEPER-757: Ledger recovery improvements, to reduce the latency on ledger recovery.
  • Misc improvements on bookie recovery and bookie storage.

To build bookkeeper, run:

  1. First checkout the bookkeeper source code from twitter's branch.
$ git clone https://github.com/twitter/bookkeeper.git bookkeeper
  1. Build the bookkeeper package:
$ cd bookkeeper
$ mvn clean package assembly:single -DskipTests

However, since bookkeeper-server is one of the dependency of distributedlog-service. You could simply run bookkeeper using same set of scripts provided in distributedlog-service. In the following sections, we will describe how to run bookkeeper using the scripts provided in distributedlog-service.

Run from distributedlog source

Build

First of all, build DistributedLog:

$ mvn clean install -DskipTests

Configuration

The configuration file bookie.conf under distributedlog-service/conf is a template of production configuration to run a bookie node. Most of the configuration settings are good for production usage. You might need to configure following settings according to your environment and hardware platform.

Port

By default, the service port is 3181, where the bookie server listens on. You can change the port to whatever port you like by modifying the following setting.

bookiePort=3181

Disks

You need to configure following settings according to the disk layout of your hardware. It is recommended to put journalDirectory under a separated disk from others for performance. It is okay to set indexDirectories to be same as ledgerDirectories. However, it is recommended to put indexDirectories to a SSD driver for better performance.

# Directory Bookkeeper outputs its write ahead log
journalDirectory=/tmp/data/bk/journal

# Directory Bookkeeper outputs ledger snapshots
ledgerDirectories=/tmp/data/bk/ledgers

# Directory in which index files will be stored.
indexDirectories=/tmp/data/bk/ledgers

To better understand how bookie nodes work, please check bookkeeper website for more details.

ZooKeeper

You need to configure following settings to point the bookie to the zookeeper server that it is using. You need to make sure zkLedgersRootPath exists before starting the bookies.

# Root zookeeper path to store ledger metadata
# This parameter is used by zookeeper-based ledger manager as a root znode to
# store all ledgers.
zkLedgersRootPath=/messaging/bookkeeper/ledgers
# A list of one of more servers on which zookeeper is running.
zkServers=localhost:2181

Stats Provider

Bookies use StatsProvider to expose its metrics. The StatsProvider is a pluggable library to adopt to various stats collecting systems. Please check Monitoring for more details.

# stats provide - use `codahale` metrics library
statsProviderClass=org.apache.bookkeeper.stats.CodahaleMetricsServletProvider

### Following settings are stats provider related settings

# Exporting codahale stats in http port `9001`
codahaleStatsHttpPort=9001

Index Settings

  • pageSize: size of a index page in ledger cache, in bytes. If there are large number of ledgers and each ledger has fewer entries, smaller index page would improve memory usage.
  • pageLimit: The maximum number of index pages in ledger cache. If nummber of index pages reaches the limitation, bookie server starts to swap some ledgers from memory to disk. Increase this value when swap becomes more frequent. But make sure pageLimit*pageSize should not be more than JVM max memory limitation.

Journal Settings

  • journalMaxGroupWaitMSec: The maximum wait time for group commit. It is valid only when journalFlushWhenQueueEmpty is false.
  • journalFlushWhenQueueEmpty: Flag indicates whether to flush/sync journal. If it is true, bookie server will sync journal when there is no other writes in the journal queue.
  • journalBufferedWritesThreshold: The maximum buffered writes for group commit, in bytes. It is valid only when journalFlushWhenQueueEmpty is false.
  • journalBufferedEntriesThreshold: The maximum buffered writes for group commit, in entries. It is valid only when journalFlushWhenQueueEmpty is false.

Setting journalFlushWhenQueueEmpty to true will produce low latency when the traffic is low. However, the latency varies a lost when the traffic is increased. So it is recommended to set journalMaxGroupWaitMSec, journalBufferedEntriesThreshold and journalBufferedWritesThreshold to reduce the number of fsyncs made to journal disk, to achieve sustained low latency.

Thread Settings

It is recommended to configure following settings to align with the cpu cores of the hardware.

numAddWorkerThreads=4
numJournalCallbackThreads=4
numReadWorkerThreads=4
numLongPollWorkerThreads=4

Run

As bookkeeper-server is shipped as part of distributedlog-service, you could use the dlog-daemon.sh script to start bookie as daemon thread.

Start the bookie:

$ ./distributedlog-service/bin/dlog-daemon.sh start bookie --conf /path/to/bookie/conf

Stop the bookie:

$ ./distributedlog-service/bin/dlog-daemon.sh stop bookie

Please check bookkeeper website for more details.