InfoQ

News

Consistency vs. availability: eventual consistency by Werner Vogels

Posted by Sadek Drobi on Jan 11, 2008 01:00 PM

Community
Architecture
Topics
Performance & Scalability
Tags
Concurrency ,
Amazon

Until the mid nineties, in the context of data replication, achieving distribution transparency and data consistency has often been the priority. However, as large internet systems started to arise, availability became another important concern to be taken into consideration. The CAP theorem presented by Eric Brewer, states that “of three properties of shared-data systems; data consistency, system availability and tolerance to network partition one can only achieve two at any given time”. Given that “in larger distributed scale systems, network partitions are a given”, either consistency or availability have to be relaxed. 

In this context, the concept of eventual consistency started to win the ground. Along the lines of his presentation at QCon London 2007, Werner Vogels has recently outlined on his blog some principles, abstractions and trade-offs related to large scale data replication and consistency requirements.  

He emphasizes that consistency is not an absolute priority: 

Inconsistency can be tolerated for two reasons: for improving read and write performance under highly concurrent conditions and for handling partition cases where a majority model would render part of the system unavailable even though the nodes are up and running.

Whether inconsistency is acceptable depends foremost on the client application. Vogels gives an example of a website where what really counts is the “user-percieved consistency”, the fact that the inconsistency window - i.e. “the period between the update and the moment when it is guaranteed that any observer will always see the updated value” - is “smaller than the time expected for the customer to return for the next page load”, so that updates propagate through the system before the next read is expected.  

More generally speaking there are, according to Vogels, “two ways of looking at consistency”.  

One is from the developer / client point of view; how they observe data updates. The second way is from the server side; how updates flow through the system and what guarantees systems can give with respect to updates.

At the client side, Vogels identifies four components: a storage system perceived as a “black box” by observers who are themselves represented by three processes: “process A […] that writes to and reads from the storage system” and “process B & C […], independent of process A that also write to and read from the storage system”. These processes are “independent and need to communicate the share information.” The client side consistency is all about “how and when an observer (in this case processes A, B or C) sees updates made to a data object in the storage systems.” 

The degree of consistency may vary:  

  • Strong consistency. After the update completes any subsequent access (by A, B or C) will return the updated value.  
  • Weak consistency. The system does not guarantee that subsequent accesses will return the updated value. A number of conditions need to be met before the value will be returned. Often this condition is the passing of time, [i.e.] the inconsistency window.  
  • Eventual consistency. The storage system guarantees that if no new updates are made to the object eventually (after the inconsistency window closes) all accesses will return the last updated value.  

Vogels also outlines for the eventual consistency model different variations that can be combined:  

  • Causal consistency. If process A has communicated to process B that it has updated a data item, a subsequent access by process B will return the updated value and a write is guaranteed to supersede the earlier write. Access by process C that has no causal relationship to process A is subject to the normal eventual consistency rules.  
  • Read-your-writes consistency. This is an important model where process A after it has updated a data item always accesses the updated value and never will see an older value. This is a special case of the causal consistency model.  
  • Session consistency. This is a practical version of the previous model, where a process accesses the storage system in the context of a session. As long as the session exists, the system guarantees read-your-writes consistency. If the session terminates because of certain failure scenarios a new session needs to be created, and the guarantees do not overlap the sessions.  
  • Monotonic read consistency. If a process has seen a particular value for the object any subsequent accesses will never return any previous values.  
  • Monotonic write consistency. In this case the system guarantees to serialize the writes by the same process. Systems that do not guarantee this level of consistency are notoriously hard to program.  

At the server-side, the concern is how to achieve the required degree of consistency and availability. Vogels gives different scenarios, where “N is the number of nodes that store replicas of the data, W – the number of replicas that need to acknowledge the receipt of the update before the update completes; and R – the number of replicas that are contacted when a data object is accessed through a read operation”  

If W+R > N than the write set and the read set always overlap and one can guarantee strong consistency. […] The problems with these configurations, which are basic quorum protocols, is that when because of failures the system cannot write to W nodes, the write operation has to fail, marking the unavailability of the system. 

[…] 

In R=1 and N=W we optimize for the read case and in the W=1 and R=N we would optimize for a very fast write. Of course in the latter case, durability is not guaranteed in the presence of failures, and if W < (N+1)/2 there is the possibility of conflicting writes because write sets do not overlap.  

Weak/eventually consistency arises when W+R <= N, meaning that there is no overlap in the read and write set. If this configuration is deliberate and not based on a failure case, than it hardly makes sense to set R to anything else but 1. 

[…] 

If W+R <= N than the system is vulnerable to reading from nodes that have not yet received the updates.  

Whether or not read-your-write, session and monotonic consistency can be achieved depends in general on the “stickiness” of clients to the server that executes the distributed protocol for them. If this is the same server every time than it is relatively easy to guarantee read-your-writes and monotonic reads. This makes it slightly harder to manage load balancing and fault-tolerance, but it is a simple solution. Using sessions, which are sticky, makes this explicit and provides an exposure level that clients can reason about.  

No comments

Watch Thread Reply

Educational Content

Bindings, Platforms, and Innovation

This presentation focuses on the Internet and separating myth from fact, history from the future, and the mundane from the imaginative. Bob Frankston presents a vision of what could and should be.

Orchestrating Long Running Activities with JBoss / JBPM

This article explores the use of JBoss and jBPM to implement design solutions that effectively address the issue of orchestrating long running activities.

Neo4j - The Benefits of Graph Databases

This presentation covers the use of graph databases as an optimal solution for data that is difficult to fit in static tables, rapidly evolving data or data that has a lot of optional attributes.

Realistic about Risk: Software development with Real Options

This session introduces Real Options and shows how it can help in running your project. Real Options is a decision-making process that can be used to manage risk.

Communication Flexibility Using Bindings

This article discusses the use of bindings on services and references (including the instance of non-configured bindings) as the means to implement SCA communications in a Web and SOA environment.

Writing DSLs in Groovy

After a short introduction to DSLs, Scott Davis plays with the keyboard showing how to approach the creation of a DSL by typing working snippets of Groovy code that get executed.

Scaling Agile with C/ALM (Collaborative Application Lifecycle Management)

IBM Rational and InfoQ present, Scaling Agile with C/ALM, an eBook showing organizations how to become “finely tuned software delivery machines” by enabling team integration and scaling.

Concurrent Programming with Microsoft F#

Amanda Laucher presents a real life enterprise application written in F#. She shows actual code snippets, explaining design decisions and suggesting how to use some of the F# constructs.