QCon London: Ensuring Data Consistency in Distributed Systems Using CRDTs

| by Jan Stenberg Follow 23 Followers on Mar 12, 2018. Estimated reading time: 2 minutes | NOTICE: The next QCon is in San Francisco April 9-10, 2018. Join us!

A note to our readers: You asked so we have developed a set of features that allow you to reduce the noise: you can get email and web notifications for topics you are interested in. Learn more about our new features.

Conflict-free Replicated Data Types (CRDTs) is a family of algorithms for ensuring strong eventual consistency in distributed systems without the use of a centralized server, which has now been theoretically proven to work, Martin Kleppmann claimed in a presentation at QCon London 2018, where he explored algorithms allowing people to collaborate on shared documents and how they can be used to synchronize data.

Kleppmann, researcher at University of Cambrige, noted that the problem with people editing the same document simultaneously has been studied since the late 80s. The goal is to find algorithms that can achieve convergence — at the end of editing everybody has the same document on the screen.

One family of algorithms are Operational Transformations (OT). Several algorithms have been developed but most of them have failed, leaving documents out of sync with each other. The only ones that have succeeded are the ones using a centralized server, which is used in for instance Google Docs. Kleppmann thinks this is a severe restriction — the data must go through a datacentre somewhere even when you are syncing some data between a mobile phone and laptop that are connected to the same local area network.

Conflict-Free Replicated Data Types (CRDTs) is another family of algorithms, used for instance in the Atom text editor. They resolve conflicts in a peer-to-peer fashion without using a centralized server, but still with a convergence guarantee. Given the problems with operation transformation algorithms, Kleppmann and his colleagues wanted to formally prove that CRDTs converge in all circumstances, even in odd edge cases.

To prove convergence, they used Isabelle, a theorem proving software, and were able to show that the CRDT algorithms they tested satisfied a consistency property called strong eventual consistency. They also had a second layer of proof using a model of a very unreliable network and could prove that the algorithms satisfied convergence guarantees in all possible executions.

An implementation of their ideas is Automerge, which is a data structure library on top of which you can build collaborative applications. The library provides a JSON-like data structure or abstraction that can be modified as a normal JSON document. Automerge is written in JavaScript and is open source, but Kleppmann emphasizes that this is research code, it is not production-ready.

Two theoretical issues that are still not solved are how to handle an undo of a change, and moving subtrees in an atomic way. Kleppmann notes that moving subtrees is not the same as deleting and reinserting it because that could lead to duplicates in some scenarios.

A significant problem is the overhead in memory space, and this is one of the areas where Kleppmann currently works. Documents with up to about 100k characters can normally be handled, but over this size you may run into heap size issues. By representing the data in an efficiently binary packed way he has decreased the overhead down to less three times the size of the plain text, which he believes solves the problem for most cases.

The slides from Kleppmann’s presentation are available for download. Most presentations at the conference were recorded and will be available on InfoQ over the coming months. The next QCon conference,, will focus on AI and machine learning and is scheduled for April 9 – 11, 2018, in San Francisco. QCon London 2019 is scheduled for March 4 - 8, 2019.

Rate this Article

Adoption Stage

Hello stranger!

You need to Register an InfoQ account or or login to post comments. But there's so much more behind being registered.

Get the most out of the InfoQ experience.

Tell us what you think

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Email me replies to any of my messages in this thread
Community comments

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Email me replies to any of my messages in this thread

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Email me replies to any of my messages in this thread


Login to InfoQ to interact with what matters most to you.

Recover your password...


Follow your favorite topics and editors

Quick overview of most important highlights in the industry and on the site.


More signal, less noise

Build your own feed by choosing topics you want to read about and editors you want to hear from.


Stay up-to-date

Set up your notifications and don't miss out on content that matters to you