Google Releases the High Replication Datastore for App Engine
Distributed, infinitely scalable and highly reliable datastores are the next holy grail of our industry. Google took a first stab at the problem two years ago when it launched the Google App Engine Datastore. Its Master/Slave replication architecture was designed to support "quick, strongly consistent reads", while enabling fast writes that are immediately available. Yet, Google had to revisit this approach:
For the past six months, as you are probably aware, we’ve been struggling with some reliability issues with the App Engine Datastore. Over the course of the past few months, we’ve made major strides in fixing these issues. However, our experience with these issues has made us rethink some of our design assumptions.
Last week, Google introduced the "High Replication Datastore" to provide a higher level of availability for both reads and writes. This was achieved at the expense of increased write latency and changes to the consistency guarantees of the API.
The High Replication Datastore increases the number of data centers that maintain replicas of your data by using the Paxos algorithm to synchronize that data across datacenters in real time. One of the most significant benefits is that all functionality of your application will remain fully available during planned maintenance periods, as well as during most unplanned infrastructure issues.
Google warns developers that:
because it is a distributed database, with all that implies in the CAP sense, developers will have to be very careful in how they architect their applications because as costs increased, reliability increased, complexity has increased, and performance has decreased
To help existing applications migrate their data to the High Replication Datastore, Google is providing some migration tools. Google has also increased the price by a factor of 3 because of the additional replication.
Todd Hoff calls it a "giant step into the fully distributed future":
The HRD is targeted at mission critical applications that require data replicated to at least three datacenters, full ACID semantics for entity groups, and lower consistency guarantees across entity groups.
Google's new datastore defines a data model that lies between the abstract tuples of an RDBMS and the concrete row-column storage of NoSQL. As in an RDBMS, the data model is declared in a schema and is strongly typed. Each schema has a set of tables, each containing a set of entities, which in turn contain a set of properties. Properties are named and typed values.
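The schema, table, entity and property hierarchy described above can be illustrated with a minimal Python sketch. This is not Google's API: the names `PropertyDef`, `Table`, `Schema` and `insert` are hypothetical and exist only to show how a strongly typed schema constrains the properties an entity may carry.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PropertyDef:
    """A named, typed property declared in the schema (hypothetical name)."""
    name: str
    type: type


@dataclass
class Table:
    """A table holds entities; each entity is a set of declared properties."""
    name: str
    properties: dict                      # property name -> PropertyDef
    entities: list = field(default_factory=list)

    def insert(self, **values):
        # Reject undeclared properties and values of the wrong type.
        for prop_name, value in values.items():
            prop = self.properties.get(prop_name)
            if prop is None:
                raise KeyError(f"undeclared property: {prop_name}")
            if not isinstance(value, prop.type):
                raise TypeError(f"{prop_name} must be {prop.type.__name__}")
        self.entities.append(dict(values))


@dataclass
class Schema:
    """A schema is a named set of tables."""
    tables: dict = field(default_factory=dict)

    def add_table(self, name, *props):
        table = Table(name, {p.name: p for p in props})
        self.tables[name] = table
        return table


schema = Schema()
users = schema.add_table("User", PropertyDef("user_id", int), PropertyDef("name", str))
users.insert(user_id=1, name="Ann")
```

Inserting `users.insert(user_id="1", name="Ann")` would raise a `TypeError` in this sketch, which is the sense in which the model is "strongly typed".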
Bigtable provides the ability to store multiple values in the same row/column pair with different timestamps. This feature is used to implement multiversion concurrency control (MVCC): when mutations within a transaction are applied, the values are written at the timestamp of their transaction. Readers use the timestamp of the last fully applied transaction to avoid seeing partial updates.
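The MVCC idea in the paragraph above can be sketched in a few lines of Python. This is a toy, in-memory illustration under stated assumptions, not Bigtable's or the datastore's actual interface: the class `VersionedStore` and its methods `write_cell`, `commit`, `apply_transaction` and `read` are hypothetical names, and timestamps are plain increasing integers.

```python
class VersionedStore:
    """Toy store keeping several timestamped values per (row, column) cell."""

    def __init__(self):
        self._cells = {}         # (row, column) -> list of (timestamp, value)
        self.last_applied = 0    # timestamp of the last fully applied transaction

    def write_cell(self, timestamp, row, column, value):
        # A mutation is written at its transaction's timestamp. It stays
        # invisible to default reads until the transaction is committed.
        if timestamp <= self.last_applied:
            raise ValueError("timestamp must be newer than last applied")
        self._cells.setdefault((row, column), []).append((timestamp, value))

    def commit(self, timestamp):
        # Mark the transaction as fully applied, making its writes visible.
        self.last_applied = timestamp

    def apply_transaction(self, timestamp, mutations):
        # Apply every mutation, then publish the transaction's timestamp.
        for (row, column), value in mutations.items():
            self.write_cell(timestamp, row, column, value)
        self.commit(timestamp)

    def read(self, row, column, at=None):
        # Readers default to the last fully applied timestamp, so they never
        # observe a transaction that is only partially written.
        ts = self.last_applied if at is None else at
        visible = [(t, v) for t, v in self._cells.get((row, column), []) if t <= ts]
        if not visible:
            return None
        return max(visible, key=lambda tv: tv[0])[1]


store = VersionedStore()
store.apply_transaction(1, {("u1", "name"): "Ann", ("u1", "city"): "Oslo"})

# Transaction 2 is half written: only one of its two cells exists so far.
store.write_cell(2, "u1", "name", "Bob")
name_during_partial_write = store.read("u1", "name")

store.write_cell(2, "u1", "city", "Rome")
store.commit(2)
```

In this sketch, `name_during_partial_write` still holds the value from transaction 1, because the reader's timestamp only advances once transaction 2 has been committed. Older versions remain readable by passing an explicit `at` timestamp.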
Average read latencies are tens of milliseconds, depending on the amount of data, showing that most reads are local. Most users see average write latencies of 100–400 ms, depending on the distance between datacenters, the size of the data being written, and the number of full replicas.
The cost of "big infrastructure", once reserved for very large corporations building mission-critical applications, is plummeting, enabling the long tail to build innovative applications that would have been unthinkable a couple of years ago. Do you plan to use Google App Engine? Do you see the need for such a datastore in your solutions? Which verticals will benefit most from this kind of infrastructure?