BT

Facilitating the Spread of Knowledge and Innovation in Professional Software Development

Write for InfoQ

Topics

Choose your language

InfoQ Homepage Articles Distributed Transactions at Scale in Amazon DynamoDB

Distributed Transactions at Scale in Amazon DynamoDB

This item in japanese

Bookmarks

Key Takeaways

  • NoSQL cloud database services are popular for their key-value operations, high availability, high scalability, and predictable performance. These characteristics are generally considered to be at odds with support for transactions. DynamoDB supports transactions, and it does it without compromising on performance, availability or scale.
  • DynamoDB added transactions using a timestamp ordering protocol while exploiting the semantics of a key-value store to achieve low latency for both transactional and non-transactional operations.
  • Amazon DynamoDB introduced two novel single-request operations: TransactGetItems and TransactWriteItems. These operations allow the execution of a set of operations atomically and serializably for any items in any table. Transactional operations on Amazon DynamoDB provide atomicity, consistency, isolation, and durability (ACID) guarantees within the region.
  • The results of experiments against a production implementation demonstrate that distributed transactions with full ACID properties can be supported without compromising on performance, availability, or scale.

Can we support transactions at scale with predictable performance? In this article, I explore why transactions are considered at odds with scalability for NoSQL databases and walk you through the journey of how we added transactions to Amazon DynamoDB.

NoSQL Databases

NoSQL databases, like DynamoDB, have gained adoption because of their flexible data model, simple interface, scale, and performance. Core features of relational databases, including SQL queries and transactions, were sacrificed to provide automatic partitioning for unlimited scalability, replication for fault-tolerance, and low latency access for predictable performance.

Amazon DynamoDB (not to be confused with Dynamo) powers applications for hundreds of thousands of customers and multiple high-traffic Amazon systems including Alexa, the Amazon.com sites, and all Amazon fulfillment centers.

In 2023, over the course of Prime Day, Amazon systems made trillions of calls to the DynamoDB API, and DynamoDB maintained high availability while delivering single-digit millisecond responses and peaking at 126 million requests per second.

When customers of DynamoDB requested ACID transactions, the challenge was how to integrate transactional operations without sacrificing the defining characteristics of this critical infrastructure service: high scalability, high availability, and predictable performance at scale.

To understand why transactions are important, let's walk through an example of building an application without support for transactions in a NoSQL database, using only basic Put and Get operations.

Transactions

A transaction is a set of read and write operations that are executed together as a single logical unit. Transactions are associated with ACID properties:

  1. Atomicity ensures that either all or none of the operations in the transaction are executed, providing an all-or-nothing semantic.
  2. Consistency ensures that the operation results in a consistent and correct state for the database.
  3. Isolation allows multiple developers to read or write data concurrently, ensuring that concurrent operations are serialized.
  4. Durability guarantees that any data written during the transaction remains permanent.

Why do we need transactions in a NoSQL database? The value of transactions lies in their ability to help construct correct and reliable applications that need to maintain multi-item invariants. Invariants of this nature are commonly encountered in a wide range of applications. Imagine an online e-commerce application where a user, Mary, can purchase a book and a pen together as a single order. Some invariants in this context could be - a book cannot be sold if it is out of stock, a pen cannot be sold if it is out of stock, and Mary must be a valid customer in order to purchase both a book and a pen.

Figure 1: Simple e-commerce scenario

However, maintaining these invariants can be challenging, particularly when multiple instances of an application run in parallel and access the data concurrently. Furthermore, the task of preserving multi-item invariants becomes challenging in the event of failures such as node failures. Transactions provide a solution for applications to address both challenges of concurrent access and partial failures, alleviating the need for developers to write excessive amounts of additional code to deal with these two challenges.

Imagine that you are developing a client-side e-commerce application relying on a database without transactional support to create Mary’s order. Your application has three tables - the inventory, the customer, and the orders tables. When you want to execute a purchase, what do you need to consider?

Figure 2: Three separate NoSQL tables for inventory, customers, and orders

First, you need to ensure that Mary is a verified customer. Next, you need to check to ensure the book is in stock and in a sellable condition. You also need to do the same checks for the pen, then you need to create a new order and update the status and count of the book and pen in the inventory. One way to achieve this is by writing all the necessary logic on the client side.

The crucial aspect is that all the operations must execute atomically to ensure that the final state has the correct values and no other readers see inconsistent state of the database while the purchase order is getting created. Without transactions, if multiple users access the same data simultaneously, there is a possibility of encountering inconsistent data. For instance, a book might be marked as sold to Mary, but the order creation could fail. Transactions provide a means to execute these operations as a logical unit, ensuring that they either all succeed or all fail, while preventing customers from observing inconsistent states.

Figure 3: What if we experience a crash without transactions?

Building an application without transactions involves navigating through various other potential pitfalls, such as network failures and application crashes. To mitigate these challenges, it becomes necessary to implement additional client-side logic for robust error handling and resilience. The developer needs to implement rollback logic, deleting unfinished transactions. Multi-user scenarios introduce another layer of complexity, needing to ensure that the data stored in the tables is consistent across all users.

Transactions and NoSQL Concerns

There are often concerns about the tradeoffs that come with implementing a transaction system in a database. NoSQL databases are expected to provide low latency performance and scalability, offering often only Get and Put operations that have consistent latency.

Figure 4: Can transactions provide predictable performance?

Many NoSQL databases do not provide transactions, with common concerns being breaking non-transactional workloads, the complexity of the APIs, and system issues such as deadlocks, contentions, and interference between non-transactional and transactional workloads. Some databases attempted to address these issues by providing them with restricted features like isolation levels or limiting the scope of transactions, allowing them to be executed in a single partition. Others enforce constraints on the primary or hash key or require upfront identification of all partitions expected to be part of a transaction.

These restrictions are designed to make the system more predictable and reduce complexity, but they come at the expense of scalability. As the database grows and splits into multiple partitions, data restriction to a single partition could lead to availability issues.

DynamoDB Transaction Goals

When we set out to add transaction support to DynamoDB, the team aimed for providing customers with the capability to perform atomic and serializable execution of operations on items across tables within a specific region with predictable performance, and no impact on non-transactional workloads.

Customer Experience

Let's focus on the customer experience and explore the options for offering transaction support in DynamoDB. Traditionally, transactions are initiated with a "begin transaction" statement and concluded with a "commit transaction". In between, customers can write all the Get and Put operations. In this approach, existing operations on single item can be simply treated as implicit transactions, consisting of a single operation. To ensure isolation, two phase locking can be used, while achieving atomicity through two phase commits.

However, DynamoDB is a multi-tenant system, and allowing long-running transactions could tie up system resources indefinitely. Enforcing the full transactional commit protocol for singleton Get and Put operations would have an adverse impact on performance for current customers who do not intend to utilize transactions. In addition, locking introduces the risk of deadlocks, which can significantly impact system availability.

Instead, we introduced two novel single-request operations - TransactGetItems and TransactWriteItems. These operations are executed atomically and in a serializable order with respect to other DynamoDB operations. TransactGetItems is designed for read-only transactions which retrieve multiple items from a consistent snapshot. This means that the read-only transaction is serialized with respect to other write transactions. TransactWriteItems is a synchronous and idempotent write operation that allows multiple items to be created, deleted, or updated atomically in one or more tables.

Such transaction may optionally include one or more preconditions on the current values of the items. Preconditions enable checking for specific conditions on item attributes, such as existence, specific values, or numerical ranges. DynamoDB rejects the TransactWriteItems request if any of the preconditions are not met. Preconditions can be added not only for items that are modified but also for items that are not modified in the transaction.

These operations don't restrict concurrency, don't require versioning, don't impact the performance of singleton operations, and allow for optimistic concurrency control on individual items. All transactions and singleton operations are serialized to ensure consistency. With TransactGetItems and TransactWriteItem, DynamoDB provides a scalable and cost-effective solution that meets ACID compliance.

Consider another example that shows the utilization of transactions in the context of a bank money transfer. Let's assume Mary wants to transfer funds to Bob. Traditional transactions involve reading Mary and Bob's account balances, checking funds availability, and executing a transaction within a TxBegin and TxCommit block. In DynamoDB, you can accomplish the same transactional behavior with a single request using the TransactWriteItems operation: checking balances, performing a transfer with TransactWriteItems, eliminating the need for TxBegin and TxCommit.

Transactions High Level Architecture

To better understand how transactions were implemented, let's delve into workflow of a DynamoDB request. When an application requests a Put/Get operation, the request is routed to a request router randomly selected by a front-end load balancer. The request router leverages a metadata service to map the table name and primary key to the set of storage nodes that store the item being accessed.

Figure 5: The router

Data in DynamoDB is replicated across multiple availability zones, with one replica serving as the leader. In the case of a Put operation, the request is routed to the leader storage node, which then propagates the data to other storage nodes in different availability zones. Once a majority of replicas have successfully written the item, a completion response is sent back to the application. Delete and update operations follow a similar process. Gets are similar except they are processed by a single storage node. In the case of consistent reads, leader replica serves the read request. However, for eventually consistent reads, any of the three replicas can serve the request.

To implement transactions, a dedicated fleet of transaction coordinators was introduced. Any transaction coordinator in the fleet can take responsibility for any transaction. When a transactional request is received, the request routers perform the needed authentication and authorization of the request and forward it to one of the transaction coordinators. These coordinators handle the routing of requests to the appropriate storage nodes responsible for the items involved in the transaction. After receiving the responses from the storage nodes, the coordinators generate a transactional response for the client, indicating the success or failure of the transaction.

Figure 6: The transaction coordinator

The transaction protocol is a two-phase process to ensure atomicity. In the first phase, the transaction coordinator sends a Prepare message to the leader storage nodes for the items being written. Upon receiving the Prepare message, each storage node verifies if the preconditions for the items are satisfied. If all the storage nodes accept the Prepare, transaction proceeds to the second phase.

In this phase, the transaction coordinator commits the transaction and instructs the storage nodes to perform their writes. Once a transaction enters the second phase, it is guaranteed to be executed in its entirety exactly once. The coordinator retries each write operation until all writes succeed. Since the writes are idempotent, the coordinator can safely resend a write request in case of scenarios like encountering a timeout.

Figure 7: When a transaction fails

If the Prepare message is not accepted by any of the participating storage nodes, then the transaction coordinator will cancel the transaction. To cancel, the transaction coordinator sends a Release message to all the participating storage nodes and sends a response to the client, indicating that the transaction has been canceled. Since no writes occur during the first phase, there is no need for a rollback process.

Transactions Recovery

To ensure atomicity of transactions and ensure completion of transactions in the event of failures, coordinators maintain a persistent record of each transaction and its outcome in a ledger. Periodically, a recovery manager scans the ledger to identify transactions that have not yet been completed. Such transactions are assigned to a new transaction coordinator which resumes execution of the transaction protocol. It is acceptable for multiple coordinators to work on the same transaction simultaneously, as the commit and release operations are idempotent.

Figure 8: The transaction coordinator and failures

Once the transaction has been successfully processed, it is marked as completed in the ledger, indicating that no further actions are necessary. The information about the transaction is purged from the ledger 10 minutes after completion of the transaction to support idempotent TransactWriteItems requests. If a client reissues the same request within this 10-minute timeframe, the information will be looked up from the ledger to ensure the request is idempotent.

Figure 9: The transaction coordinator and the ledger

Ensuring Serializability

Timestamp ordering is used to define the logical execution order of transactions. Upon receiving a transaction request, the transaction coordinator assigns a timestamp to the transaction using the value of its current clock. Once a timestamp has been assigned, the nodes participating in the transaction can perform their operations without coordination. Each storage node is responsible for ensuring that the requests involved in the items are executed in the proper order and for rejecting conflicting transactions that may come out of order. If each transaction executes at the assigned timestamp, serializability is achieved.

Figure 10: Using a timestamp-based ordering protocol

To handle the load from transactions, a large number of transaction coordinators operate in parallel. To prevent unnecessary transaction aborts due to unsynchronized clocks, the system uses a time sync service provided by AWS to keep the clocks in coordinator fleets closely in sync. However, even with perfectly synchronized clocks, transactions can arrive at storage nodes out of order due to delays, network failures, and other issues. Storage nodes deal with transactions that arrive in any order using stored timestamps.

TransactGetItems

The TransactGetItems API works similarly to the TransactWriteItems API but does not use the ledger to avoid latency and cost. TransactGetItems implements a two-phase write-less protocol for executing read transactions. In the first phase, the transaction coordinator reads all the items in the transaction's read-set. If any of these items are being written by another transaction, then the read transaction is rejected; otherwise, the read transaction moves to the second phase.

In its response to the transaction coordinator, the storage node not only returns the item's value but also its current committed log sequence number (LSN), representing the last acknowledged write by the storage node. In the second phase, the items are re-read. If there have been no changes to the items between the two phases, the read transaction returns successfully with fetched item values. However, if any item has been updated between the two phases, the read transaction is rejected.

Transactional vs Non-transactional Workloads

To ensure no performance degradation for applications not using transactions, non-transactional operations bypass the transaction coordinator and the two-phase protocol. These operations are directly routed from request routers to storage nodes, resulting in no performance impact.

Transaction goals revisited

What about the scalability concerns we raised at the beginning? Let’s see what we achieved by adding transactions to DynamoDB:

  • Traditional Get/Put operations have not been affected and have the same performance as for not transactional workloads.
  • The TransactGetItems API works similarly to the TransactWriteItems API but does not use the ledger to avoid latency and cost.
  • All operations maintain the same latency as the system scales.
  • Utilizing single-request transactions and timestamp ordering we have both transactions and scalability.

Figure 11: Predictable latency for transactions

Best Practices

What are the best practices for using transactions on DynamoDB?

  1. Idempotent write transactions: When making a TransactWriteItems call, you have the option to include a client token to ensure the request is idempotent. Incorporating idempotence into your transactions helps prevent potential application errors in case the same operation is inadvertently submitted multiple times. This feature is available by default when utilizing the AWS SDKs.
  2. Auto-scaling or on-demand: It is recommended to enable auto-scaling or utilize on-demand tables. This ensures that the necessary capacity is available to handle the transaction workload effectively.
  3. Avoid transactions for bulk loading: For bulk loading purposes, it is more cost-effective and efficient to utilize the DynamoDB bulk Import feature instead of relying on transactions.

DynamoDB transactions have been greatly influenced by the invaluable feedback of our customers, who inspire us to innovate on their behalf. I am grateful to have such an outstanding team by my side throughout this journey. Special thanks to Elizabeth Solomon, Prithvi Ramanathan and Somu Perianayagam for reviewing the article and sharing their feedback to refine this article. You can learn more about DynamoDB in the paper published at USENIX ATC 2022 and about DynamoDB transactions in the paper published at USENIX ATC 2023.

About the Author

Rate this Article

Adoption
Style

Hello stranger!

You need to Register an InfoQ account or or login to post comments. But there's so much more behind being registered.

Get the most out of the InfoQ experience.

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Community comments

  • Why do the traditional Put operations have not been affected by TransactWriteItems?

    by Young Lion,

    Your message is awaiting moderation. Thank you for participating in the discussion.

    Why do the traditional Put operations have not been affected by TransactWriteItems? Is there any storage node locks for TransactWriteItems phase one and Put? If there’s not, how to handle the concurrency of Puts and TransactWriteItems of the same keys? If there is, then the Puts should be affected maybe?

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

BT