BT

Your opinion matters! Please fill in the InfoQ Survey!

CockroachDB: A Scalable, Geo-Replicated, Transactional Datastore

| by Benjamin Darfler  Followers on Aug 29, 2014. Estimated reading time: 1 minute |

A note to our readers: As per your request we have developed a set of features that allow you to reduce the noise, while not losing sight of anything that is important. Get email and web notifications by choosing the topics you are interested in.

The team behind CockroachDB, an open source datastore project, has recently announced its initial alpha version. Inspired by Google’s Spanner project, CockroachDB aims to address the current lack of an open source, scalable, geo-replicated, ACID compliant database.

Traditionally, NoSQL datastores have supported multi datacenter configurations through the use of asynchronous replication between datacenters. This is the recommended approach in MongoDB and it is the only approach available in Riak, Cassandra and HBase. However, asynchronous replication prevents consistency guarantees between datacenters. Moreover, most of the first generation NoSQL datastores failed to support arbitrary cross key transactions. Recently, new systems such as FoundationDB and HyperDex have been designed with cross key transaction support. However, these systems caution against multi datacenter configurations.

Against that backdrop Google announced their Spanner system in 2012 which broke new ground supporting both transactions and multi datacenter consistency. While the Spanner paper inspired many, it wasn’t until the CockroachDB team began on their implementation that an open source project tried to replicate Spanner’s challenging mix of features.

In the draft specification, Spencer Kimball, who worked on the Colossus file system at Google, describes CockroachDB as supporting

ACID transactional semantics and versioned values as first-class features. The primary design goal is global consistency and survivability, hence the name. … Cockroach implements a single, monolithic sorted map from key to value where both keys and values are byte strings.

CockroachDB’s transactional features are provided in two ways. Kimball explains that

If all keys affected by a logical mutation fall within the same range, atomicity and consistency are guaranteed by Raft; this is the fast commit path. Otherwise, a non-locking distributed commit protocol is employed between affected ranges. Cockroach provides snapshot isolation (SI) and serializable snapshot isolation (SSI) semantics … SSI is the default isolation; clients must consciously decide to trade correctness for performance.

The non-locking distributed commit protocol is still a work in progress but Kimball references papers from The University of Sidney, Yale, and draws special attention to a paper from Yahoo! as potential options for implementing SSI semantics.

The current alpha version provides a very minimal subset of the desired functionality. Cluster initialization and joining, gossip network, and a basic key-value REST API are currently supported but there is no raft consensus, range splitting or transactions. The project is written in Go and they are currently looking for contributors.

Rate this Article

Adoption Stage
Style

Hello stranger!

You need to Register an InfoQ account or or login to post comments. But there's so much more behind being registered.

Get the most out of the InfoQ experience.

Tell us what you think

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Email me replies to any of my messages in this thread

Atomic clocks by Igor Kolomiets

Since the project is inspired by Google's Spanner one question begs to be asked - is there need for atomic clocks?

Re: Atomic clocks by Benjamin Darfler

Thats a great question. The short answer is no, it does not need atomic clocks. CockroachDB maintains slightly weaker semantics than Spanner and can therefore get away without them.

One example of this can be seen in how the two systems deal with out of band communication (e.g. communicating via a message queue). Take the situation in which one client writes a value, then sends a message to another client which in turn updates the same value. In Spanner there is no way for the second write to occur before the first write because the atomic clocks allow for a system wide real time ordering. CockroachDB does not provide this strong guarantee and instead passes a wait time value back to the client so that it can wait an appropriate amount of time if it will be communicating out of band.

More can be read about this in the CockroachDB spec.

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Email me replies to any of my messages in this thread

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Email me replies to any of my messages in this thread

2 Discuss

Login to InfoQ to interact with what matters most to you.


Recover your password...

Follow

Follow your favorite topics and editors

Quick overview of most important highlights in the industry and on the site.

Like

More signal, less noise

Build your own feed by choosing topics you want to read about and editors you want to hear from.

Notifications

Stay up-to-date

Set up your notifications and don't miss out on content that matters to you

BT