Low Latency Content on InfoQ

News

Architecture & Design

Uber Moves from Static Limits to Priority-Aware Load Control for Distributed Storage

Uber engineers detailed how they evolved their storage platform from static rate limiting to a priority-aware load management system. The approach protects Docstore and Schemaless, Uber’s MySQL-based distributed databases, by colocating control with storage, prioritizing critical traffic, and dynamically shedding load under overload conditions.

Leela Kumili
on Jan 29, 2026
Architecture & Design

From On-Demand to Live: Netflix Streaming to 100 Million Devices in under 1 Minute

Netflix’s global live streaming platform powers millions of viewers with cloud-based ingest, custom live origin, Open Connect delivery, and real-time recommendations. This article explores the architecture, low-latency pipelines, adaptive bitrate streaming, and operational monitoring that ensure reliable, scalable, and synchronized live event experiences worldwide.

Leela Kumili
on Dec 05, 2025
Architecture & Design

Reddit Migrates Comment Backend from Python to Go Microservice to Halve Latency

Reddit has rebuilt its core backend, migrating Comments, Accounts, Posts, and Subreddits from a legacy Python monolith to Go microservices. The migration improves performance, halves critical write latency, and modernizes the platform for future scalability while preserving correctness across multiple datastores.

Leela Kumili
on Nov 28, 2025
Architecture & Design

Airbnb Adds Adaptive Traffic Control to Manage Key Value Store Spikes

Airbnb upgraded Mussel, its multi-tenant key-value store, replacing static per-client rate limits with an adaptive, resource-aware traffic control system. The redesign ensures resilience during traffic spikes, protects critical workflows, and maintains fair usage across thousands of tenants while scaling efficiently.

Leela Kumili
on Nov 21, 2025
Architecture & Design

Inside Uber’s Query Architecture: Simplifying Layers and Improving Observability

Uber rebuilt its Apache Pinot query architecture, replacing the Presto-based Neutrino system with a lightweight proxy called Cellar and Pinot’s Multi-Stage Engine Lite Mode. The redesign simplifies SQL execution, improves resource management, and ensures predictable performance for large-scale analytics workloads.

Leela Kumili
on Nov 06, 2025
Architecture & Design

Uber Achieves 150M Reads per Second with CacheFront Improvements

Uber has updated its CacheFront architecture to handle over 150 million reads per second. The new design improves consistency and reduces stale reads by integrating Flux for MySQL binlog tailing, enhancing the storage engine, and introducing Cache Inspector for monitoring and optimization.

Leela Kumili
on Oct 06, 2025
Architecture & Design

Allegro Reduces Kafka Producer Latency Outliers by 82% after Switching to XFS

Allegro experimented with different performance optimization options to improve Apache Kafka producer tail latency and eventually switched all its clusters to the XFS filesystem. The company used Kafka protocol sniffing, JVM profiling, and eBPF, which proved instrumental in identifying and eliminating performance bottlenecks.

Rafal Gancarz
on Apr 26, 2024
Architecture & Design

QCon London: Lessons Learned from Building LinkedIn’s AI/ML Data Platform

At the QCon London 2024 conference, Félix GV from LinkedIn discussed the AI/ML platform powering the company’s products. He specifically delved into Venice DB, the NoSQL data store used for feature persistence. The presenter shared the lessons learned from evolving and operating the platform, including cluster management and library versioning.

Rafal Gancarz
on Apr 15, 2024
Cloud

Amazon RDS Introduces Faster Storage for High-Performance Database Workloads

AWS has recently introduced support for io2 Block Express volumes on Amazon RDS. Priced the same as the existing Provisioned IOPS (PIOPS) io1 volumes, the new io2 Block Express volumes are compatible with all database engines and are designed for high-performance, high-throughput, and low-latency database workloads.

Renato Losio
on Mar 24, 2024
Architecture & Design

Uber Improves Resiliency of Microservices with Adaptive Load Shedding

Uber created a new load-shedding library for its microservice platform, which serves over 130 million customers and handles aggregated peaks of millions of requests per second (RPS). The company replaced its QALM-based solution with the Cinnamon library, which, in addition to graceful degradation, can dynamically and continuously adjust the capacity of the service and the amount of load shedding.

Rafal Gancarz
on Feb 06, 2024
Architecture & Design

How RevenueCat Manages Caching for Handling over 1.2 Billion Daily API Requests

RevenueCat uses caching extensively to improve the availability and performance of its product API while ensuring consistency. The company shared the techniques it uses to deliver a platform that can handle over 1.2 billion daily API requests. The team at RevenueCat also created an open-source memcache client that provides several advanced features.

Rafal Gancarz
on Jan 29, 2024
Architecture & Design

Discord Scales to 1 Million+ Online MidJourney Users in a Single Server

Discord optimized its platform to serve over one million online users in a single server while maintaining a responsive user experience. The company evolved the guild component, which is responsible for fanning out billions of message notifications, through a series of performance and scalability improvements supported by system observability and performance tuning.

Rafal Gancarz
on Jan 26, 2024
Architecture & Design

Why LinkedIn chose gRPC+Protobuf over REST+JSON: Q&A with Karthik Ramgopal and Min Chen

LinkedIn announced that it would be moving to gRPC with Protocol Buffers for inter-service communication in its microservices platform, which previously used the open-source Rest.li framework with JSON as the primary serialization format. InfoQ contacted Karthik Ramgopal and Min Chen to learn more about the decision and the company's motivations behind it.

Rafal Gancarz
on Dec 27, 2023
Cloud

Amazon S3 Introduces High-Performance Storage Class

During the recent re:Invent conference, AWS announced the general availability of S3 Express One Zone, a high-performance, single-AZ storage class that provides single-digit millisecond data access. The new storage class also reduces request costs and is designed for processing data in AI/ML training and financial modeling.

Renato Losio
on Dec 06, 2023
Architecture & Design

LinkedIn Migrates Espresso to HTTP2 and Reduces Connections by 88% and Latency by 75%

LinkedIn dramatically improved the scalability and performance of its Espresso database by migrating it from HTTP/1.1 to HTTP/2, reducing the number of connections, latency, and garbage collection times. To achieve these gains, the team had to optimize Netty’s default HTTP/2 stack to fit their needs.

Rafal Gancarz
on Dec 04, 2023
