Consistency vs. availability: eventual consistency by Werner Vogels
Until the mid-nineties, achieving distribution transparency and data consistency was often the priority in data replication. However, as large internet systems began to emerge, availability became another important concern to be taken into consideration. The CAP theorem, presented by Eric Brewer, states that “of three properties of shared-data systems; data consistency, system availability and tolerance to network partition one can only achieve two at any given time”. Given that “in larger distributed scale systems, network partitions are a given”, either consistency or availability has to be relaxed.
In this context, the concept of eventual consistency started to gain ground. Along the lines of his presentation at QCon London 2007, Werner Vogels has recently outlined on his blog some principles, abstractions and trade-offs related to large-scale data replication and consistency requirements.
He emphasizes that consistency is not an absolute priority:
Inconsistency can be tolerated for two reasons: for improving read and write performance under highly concurrent conditions and for handling partition cases where a majority model would render part of the system unavailable even though the nodes are up and running.
Whether inconsistency is acceptable depends foremost on the client application. Vogels gives an example of a website where what really counts is the “user-perceived consistency”, the fact that the inconsistency window - i.e. “the period between the update and the moment when it is guaranteed that any observer will always see the updated value” - is “smaller than the time expected for the customer to return for the next page load”, so that updates propagate through the system before the next read is expected.
More generally speaking there are, according to Vogels, “two ways of looking at consistency”.
One is from the developer / client point of view: how they observe data updates. The second is from the server side: how updates flow through the system and what guarantees the system can give with respect to updates.
At the client side, Vogels identifies four components: a storage system perceived as a “black box” by observers who are themselves represented by three processes: “process A […] that writes to and reads from the storage system” and “process B & C […], independent of process A that also write to and read from the storage system”. These processes are “independent and need to communicate to share information.” Client-side consistency is all about “how and when an observer (in this case processes A, B or C) sees updates made to a data object in the storage systems.”
The degree of consistency may vary:
- Strong consistency. After the update completes any subsequent access (by A, B or C) will return the updated value.
- Weak consistency. The system does not guarantee that subsequent accesses will return the updated value. A number of conditions need to be met before the value will be returned. Often this condition is the passing of time, [i.e.] the inconsistency window.
- Eventual consistency. The storage system guarantees that if no new updates are made to the object eventually (after the inconsistency window closes) all accesses will return the last updated value.
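The difference between strong and eventual consistency can be illustrated with a toy model. The following is a minimal Python sketch (the class and method names are illustrative, not from Vogels' post) in which a write first lands on a single replica and only later propagates to the others; until propagation happens, a reader of another replica sits inside the inconsistency window:

```python
class EventuallyConsistentStore:
    """Toy model of eventual consistency: a write completes after
    reaching one replica and propagates to the others lazily."""

    def __init__(self, n_replicas=3):
        self.replicas = [{} for _ in range(n_replicas)]
        self.pending = []  # updates not yet propagated to all replicas

    def write(self, key, value):
        # The write "completes" once a single replica has it.
        self.replicas[0][key] = value
        self.pending.append((key, value))

    def read(self, key, replica=0):
        return self.replicas[replica].get(key)

    def propagate(self):
        # Closing the inconsistency window: apply pending updates everywhere.
        for key, value in self.pending:
            for r in self.replicas:
                r[key] = value
        self.pending.clear()


store = EventuallyConsistentStore()
store.write("x", 1)
print(store.read("x", replica=0))  # 1: the writer's replica sees the update
print(store.read("x", replica=2))  # None: still inside the inconsistency window
store.propagate()
print(store.read("x", replica=2))  # 1: all accesses now return the last update
```

A strongly consistent store would, in effect, run `propagate` before acknowledging the write; the eventual model only guarantees that, absent new updates, propagation happens at some point.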
Vogels also outlines several variations of the eventual consistency model that can be combined:
- Causal consistency. If process A has communicated to process B that it has updated a data item, a subsequent access by process B will return the updated value and a write is guaranteed to supersede the earlier write. Access by process C that has no causal relationship to process A is subject to the normal eventual consistency rules.
- Read-your-writes consistency. This is an important model where process A, after it has updated a data item, always accesses the updated value and will never see an older value. This is a special case of the causal consistency model.
- Session consistency. This is a practical version of the previous model, where a process accesses the storage system in the context of a session. As long as the session exists, the system guarantees read-your-writes consistency. If the session terminates because of certain failure scenarios a new session needs to be created, and the guarantees do not overlap the sessions.
- Monotonic read consistency. If a process has seen a particular value for the object any subsequent accesses will never return any previous values.
- Monotonic write consistency. In this case the system guarantees to serialize the writes by the same process. Systems that do not guarantee this level of consistency are notoriously hard to program.
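The read-your-writes and monotonic read variations can be sketched with version tracking, one common way to implement such session guarantees (the classes below are hypothetical, not taken from Vogels' post): the session remembers the highest version it has written or observed and refuses to read from a replica that lags behind it.

```python
class Replica:
    def __init__(self):
        self.version = 0
        self.data = {}


class Session:
    """Sketch of session consistency: within the session, reads are
    read-your-writes and monotonic because the client never accepts
    data from a replica older than what it has already seen."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.min_version = 0  # lowest version this session may observe

    def write(self, key, value):
        primary = self.replicas[0]
        primary.version += 1
        primary.data[key] = value
        self.min_version = primary.version  # remember our own write

    def read(self, key):
        # Accept any replica at least as fresh as what we've seen so far.
        for r in self.replicas:
            if r.version >= self.min_version:
                self.min_version = max(self.min_version, r.version)
                return r.data.get(key)
        raise RuntimeError("no sufficiently fresh replica available")


a, b = Replica(), Replica()
session = Session([a, b])
session.write("k", "v")
print(session.read("k"))  # "v": the stale replica b is never consulted ahead of a
```

If the session is lost, `min_version` is lost with it, which matches Vogels' caveat that the guarantees do not overlap sessions.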
At the server side, the concern is how to achieve the required degree of consistency and availability. Vogels describes different scenarios, where “N is the number of nodes that store replicas of the data; W – the number of replicas that need to acknowledge the receipt of the update before the update completes; and R – the number of replicas that are contacted when a data object is accessed through a read operation”.
If W+R > N then the write set and the read set always overlap and one can guarantee strong consistency. […] The problem with these configurations, which are basic quorum protocols, is that when because of failures the system cannot write to W nodes, the write operation has to fail, marking the unavailability of the system.
With R=1 and W=N we optimize for the read case, and with W=1 and R=N we optimize for a very fast write. Of course, in the latter case durability is not guaranteed in the presence of failures, and if W < (N+1)/2 there is the possibility of conflicting writes because write sets do not overlap.
Weak/eventual consistency arises when W+R <= N, meaning that there is no overlap in the read and write set. If this configuration is deliberate and not based on a failure case, then it hardly makes sense to set R to anything else but 1.
If W+R <= N then the system is vulnerable to reading from nodes that have not yet received the updates.
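The N/W/R arithmetic above can be captured in a few lines of Python (a sketch; the function name is made up for illustration):

```python
def quorum_properties(n, w, r):
    """Classify an (N, W, R) replica configuration by the two rules
    Vogels states: W+R > N gives strong consistency (read and write
    quorums overlap), and W < (N+1)/2 allows conflicting writes
    (two write quorums may fail to overlap)."""
    return {
        "strong_consistency": w + r > n,
        "conflicting_writes_possible": w < (n + 1) / 2,
    }


# Majority reads and writes on 3 nodes: strongly consistent.
print(quorum_properties(3, 2, 2))
# W=1, R=1 on 3 nodes: fast but only weakly/eventually consistent,
# and concurrent writes may conflict.
print(quorum_properties(3, 1, 1))
```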
Whether or not read-your-writes, session and monotonic consistency can be achieved depends in general on the “stickiness” of clients to the server that executes the distributed protocol for them. If this is the same server every time, then it is relatively easy to guarantee read-your-writes and monotonic reads. This makes it slightly harder to manage load balancing and fault-tolerance, but it is a simple solution. Using sessions, which are sticky, makes this explicit and provides an exposure level that clients can reason about.
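One simple way to obtain such stickiness is to hash the client identifier so that the same client always reaches the same server. The sketch below is illustrative only (a real load balancer would also handle failover and rebalancing, which is exactly the added difficulty Vogels mentions):

```python
import hashlib


def pick_server(client_id, servers):
    """Sticky routing: hashing the client id deterministically maps
    each client to one server, so that server can locally guarantee
    read-your-writes and monotonic reads for that client."""
    digest = int(hashlib.sha256(client_id.encode()).hexdigest(), 16)
    return servers[digest % len(servers)]


servers = ["replica-a", "replica-b", "replica-c"]
# The same client is always routed to the same replica.
assert pick_server("alice", servers) == pick_server("alice", servers)
```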