Eventually Consistent, Revisited
Building reliable distributed systems at a worldwide scale demands trade-offs between consistency and availability. Last month, Amazon’s CTO Werner Vogels posted an article describing approaches to tolerate eventual data consistency in large-scale distributed systems.
As discussed in the recent InfoQ post:
One of the key aspects of the system architect's role is to weigh up conflicting requirements and decide on a solution, often by trading off one aspect against another.
A new post by Amazon’s CTO Werner Vogels discussed how these fundamental requirements apply to the infrastructure services, providing resources for constructing Internet-scale computing platforms.
Given the worldwide scope of these systems, we use replication techniques ubiquitously to guarantee consistent performance and high availability. Although replication brings us closer to our goals, it cannot achieve them in a perfectly transparent manner; under a number of conditions the customers of these services will be confronted with the consequences of using replication techniques inside the services. One of the ways in which this manifests itself is in the type of data consistency that is provided, particularly when the underlying distributed system provides an eventual consistency model for data replication. When designing these large-scale systems at Amazon, we use a set of guiding principles and abstractions related to large-scale data replication and focus on the trade-offs between high availability and data consistency.
According to Werner, there are two ways of looking at consistency: from the developer/client point of view - how they observe data updates; and from the server point of view - how updates flow through the system and what guarantees systems can give with respect to updates.
The following should be considered when defining a client-side consistency model:
- Storage system...one should assume that under the covers it is something of large scale and highly distributed, and that it is built to guarantee durability and availability..
- Process A. This is a process that writes to and reads from the storage system.
- Processes B and C. These two processes are independent of process A and write to and read from the storage system... Client-side consistency has to do with how and when observers (in this case the processes A, B, or C) see updates made to a data object in the storage systems.
- Strong consistency. After the update completes, any subsequent access (by A, B, or C) will return the updated value.
- Weak consistency. The system does not guarantee that subsequent accesses will return the updated value. A number of conditions need to be met before the value will be returned. The period between the update and the moment when it is guaranteed that any observer will always see the updated value is dubbed the inconsistency window.
- Eventual consistency. This is a specific form of weak consistency; the storage system guarantees that if no new updates are made to the object, eventually all accesses will return the last updated value. If no failures occur, the maximum size of the inconsistency window can be determined based on factors such as communication delays, the load on the system, and the number of replicas involved in the replication scheme.
The variations of a client-side consistency model are:
- Causal consistency. If process A has communicated to process B that it has updated a data item, a subsequent access by process B will return the updated value, and a write is guaranteed to supersede the earlier write. Access by process C that has no causal relationship to process A is subject to the normal eventual consistency rules.
- Read-your-writes consistency. This is an important model where process A, after it has updated a data item, always accesses the updated value and will never see an older value. This is a special case of the causal consistency model.
- Session consistency. This is a practical version of the previous model, where a process accesses the storage system in the context of a session. As long as the session exists, the system guarantees read-your-writes consistency... the guarantees do not overlap the sessions.
- Monotonic read consistency. If a process has seen a particular value for the object, any subsequent accesses will never return any previous values.
- Monotonic write consistency. In this case the system guarantees to serialize the writes by the same process. Systems that do not guarantee this level of consistency are notoriously hard to program.
The level of consistency on a server side depends on how updates are propagated between data replicas (which is the typical way to improve throughput and provide scalability). Weak/eventual consistency happens when not all data replicas participate in the update operation and/or contacted as part of read operation. The two common scenarios where this situation occurs are massive replication for read scaling and the cases with the complicated data access. In most of these systems the updates are propagated in a lazy manner to the remaining nodes in the replica's set. The period until all replicas have been updated is the inconsistency window and the overall system is vulnerable to reading from nodes that have not yet received the updates. The level of consistency provided by a server can be improved through specific implementations of client/server communications or by the client itself:
Whether or not read-your-writes, session, and monotonic consistency can be achieved depends in general on the "stickiness" of clients to the server that executes the distributed protocol for them. If this is the same server every time, then it is relatively easy to guarantee read-your-writes and monotonic reads. This makes it slightly harder to manage load balancing and fault tolerance, but it is a simple solution... Sometimes the client implements read-your-writes and monotonic reads. By adding versions on writes, the client discards reads of values with versions that precede the last-seen version.
Each client application has its own tolerance to inconsistencies provided by a server, but in all cases it should be aware of the consistency level that the server application provides. There are a number of practical improvements to the eventual consistency model, such as session-level consistency and monotonic reads, which provide better tools for the developer.