Consistency Content on InfoQ
-
From Outages to Order: Netflix’s Approach to Database Resilience with WAL
Netflix uses a Write-Ahead Log (WAL) system to improve data platform resilience, addressing data loss, replication entropy, multi-partition failures, and corruption. WAL decouples producers and consumers, leverages SQS/Kafka with dead-letter queues, and supports delay queues, cross-region replication, and multi-table mutations for high-throughput, consistent, and recoverable database operations.
-
Uber Achieves 150M Reads per Second with CacheFront Improvements
Uber has updated its CacheFront architecture to handle over 150 million reads per second. The new design improves consistency and reduces stale reads by integrating Flux for MySQL binlog tailing, enhancing the storage engine, and introducing Cache Inspector for monitoring and optimization.
-
Cloudflare Rearchitects Workers KV Following GCP Outage, Achieves 40x Performance Gain
Cloudflare has recently redesigned Workers KV with a hybrid storage architecture that automatically routes objects between distributed databases and object storage based on size characteristics, while operating dual storage backends. This change improved the p99 read latencies from 200ms to under 5ms for their global key-value store while handling hundreds of billions of key-value pairs.
-
Netflix Revamps Tudum’s CQRS Architecture with RAW Hollow In-Memory Object Store
Netflix replaced a CQRS implementation using Kafka and Cassandra with a new solution leveraging RAW Hollow, an in-memory object store developed internally. The revamped Tudum architecture offers much faster content previews during the editorial process and faster page rendering for visitors.
-
Fast Eventual Consistency: Inside Corrosion, the Distributed System Powering Fly.io
Somtochi Onyekwere recently presented Corrosion, Fly.io's open-source distributed system, at QCon London 2025. By leveraging CRDTs and Rust, Corrosion improves scalability and data synchronization, addressing latency challenges and enabling rapid, consistent application deployment across a global network of 40+ regions.
-
How Monzo Bank Built a Cost-Effective, Unorthodox Backup System to Ensure Resilient Banking
Monzo Bank recently revealed Stand-in, an independent backup system on GCP that ensures essential banking services remain operational during application and AWS infrastructure outages. Unlike traditional backups, it's a minimal stand-alone system that exclusively supports key operations and features a cost-effective design, resulting in 1% of the operational costs of the primary deployment.
-
Improving Distributed System Data Integrity with Amazon S3 Conditional Writes
AWS recently announced support for conditional writes in Amazon S3, allowing users to check for the existence of an object before creating it. This feature helps prevent overwriting existing objects when uploading data, making it easier for applications with multiple concurrent writers to manage data.
-
How RevenueCat Manages Caching for Handling over 1.2 Billion Daily API Requests
RevenueCat extensively uses caching to improve the availability and performance of its product API while ensuring consistency. The company shared its techniques to deliver the platform, which can handle over 1.2 billion daily API requests. The team at RevenueCat created an open-source memcache client that provides several advanced features.
-
Cloudflare Previews Serverless Database D1 Adding Transactions
Cloudflare recently unveiled more details about the serverless database D1, the new service supporting SQLite to store and query relational data globally with low latency. D1 is the first SQL database from the content delivery network company and will support transactions.
-
Email Classification at Slack: Designing an Eventually Consistent Custom Classifier
Slack recently published the details of how it built an email address classification engine that can determine if an email address is internal or external. Slack engineers utilized an eventually consistent near real-time representation of the data in its system and implemented a drift detection mechanism to fix erroneous data, keeping the engine's operation in order.
-
Amazon S3 Now Delivers Strong Read-After-Write Consistency
To guarantee higher availability and better performance, S3 has for years relied on an eventual consistency model. During the first week of re:Invent, AWS announced that S3 now supports strong read-after-write consistency.
-
Causal Consistency for Large Neo4j Clusters
Jim Webber, chief scientist at Neo Technology, explored how Neo4j implements causal consistency at QCon London 2017. The presentation included a high-level overview of Neo4j's clustering architecture, its implementation of consensus using Raft, and a pattern called bookmarking used to achieve read-after-write consistency.
-
Real-World Consistency Explained: Uwe Friedrichsen Discusses His Favourite Academic Papers
At the microXchg 2016 conference, held in Berlin, Germany, Uwe Friedrichsen presented a deep-dive into “real-world consistency explained”. Friedrichsen referenced multiple academic papers and discussed topics such as ACID vs BASE, his belief that many developers may not fully understand consistency guarantees with a typical SQL database, and how consistency affects microservice systems.
-
Scaling Stateful Services
At the Strange Loop conference, Caitie McCaffrey, distributed systems engineer at Twitter, talked about the benefits of stateful services, which are less well known in the industry than their stateless counterparts, and about how they can be scaled. The benefits include data locality, higher availability, and stronger consistency models. McCaffrey also gave real-world examples of stateful services.
-
Alternatives to Eventual Consistency
Causal consistency models offer an alternative to eventual consistency for distributed systems; both models should be weighed against your system's requirements and risk tolerance.