Five short years ago the term NoSQL
was just beginning to take off, many NoSQL databases were pre version 1.0 and when, it came to the CAP tradeoff, choosing availability over consistency was in vogue. ACID was seen as an albatross to be tossed off and BASE was the way of the future. Fast forward to today and the community has matured, some of the hype has died down and distributed, fault tolerant transactions are moving into the fore as a new round of NoSQL databases seek to redefine our NoSQL expectations.
The march towards distributed, fault tolerant transactions started in ernest in the fall of 2012 when Google announced their Spanner database. Spanner is a globally distributed, fault tolerant, transactional NoSQL database; a collection of properties that had previously seemed to be a contradiction in terms. However, Google broke that misconception by announcing that they had had such a database in production for over a year.
A few months after Google’s announcement, the HyperDex team quietly announced their Warp add-on which brought distributed, fault tolerant transactions to HyperDex. This marked the first readily available, open source implementation of such transactions. The tide was starting to shift but there was still a ways to go.
In May of 2013 Kyle Kingsbury gave his Jepsen talk at RICON. In his talk, Kingsbury exposed a wide range of weaknesses in various NoSQL databases under a variety of failure conditions. Even databases such as MongoDB and Redis, both of which are typically categorized as consistent databases, failed to live up to their promises. Kingsbury’s work pushed the community to focus on testing and formal designs and to better understand the tradeoffs when availability is chosen over consistency.
Amidst this concern over consistency, FoundationDB launched version 1.0 of their key-value store which marked the first proprietary NoSQL database with distributed, fault tolerant transactions. The FoundationDB team understood the necessity for rigorous testing and spoke passionately about Lithium, their testing framework, which enabled them to stand behind their ACID guarantees. Later they would run Jepsen against FoundationDB without any surprises.
More recently, 2014 saw the start of not one but two open source NoSQL databases with the explicit design goal of supporting distributed, fault tolerant transactions. Work on CockroachDB, which attempts to replicate Spanner’s geo-replicated transactions, got underway early in the year, while Treode released its initial 0.1 version in June. Both of these projects take formal design seriously and are pulling from a wide range of academic work in distributed systems.
The impact of these transactional databases is already being felt in the NoSQL world as we see consistent
NoSQL databases being driven to improve the strength of their guarantees. For instance there is increasing pressure on Redis to focus on formal designs and testing as it builds out its distributed capabilities. More recently, Tokutek released its new Ark consensus algorithim for MonogoDB. Ark is based on the Raft protocol and aims to fix known consistency issues in MongoDB.
While it is still very early days for Treode and CockroachDB there are already a number of companies using FoundationDB and HyperDex Warp in production. Distributed, fault tolerant transactions are here to stay in NoSQL and we will only see their impact on the landscape grow.