Distributed Systems Content on InfoQ
-
Netflix Serves 84% of Query Results from Cache with Interval-Aware Caching in Apache Druid
Netflix improves Apache Druid performance with interval-aware caching, serving 84% of analytics results from cache and reducing query load by 33%. The system decomposes rolling-window queries into reusable time segments, so cached segments can be reused and only recent data is recomputed. At scale, it reduces scan volume, improves P90 latency, and optimizes real-time analytics workloads.
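The decomposition described above can be illustrated with a minimal sketch. It is not Netflix's or Druid's implementation: the `IntervalCache` class, the fixed one-hour bucket size, the additive merge of partial results, and the `compute` callback are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def split_into_buckets(start, end, bucket=timedelta(hours=1)):
    """Yield (bucket_start, bucket_end) pairs covering [start, end)."""
    cur = start
    while cur < end:
        yield cur, min(cur + bucket, end)
        cur += bucket


class IntervalCache:
    """Hypothetical cache keyed by time bucket rather than by whole query."""

    def __init__(self, compute, bucket=timedelta(hours=1)):
        self.compute = compute  # assumed backend call: (start, end) -> number
        self.bucket = bucket
        self.store = {}
        self.hits = 0
        self.misses = 0

    def query(self, start, end, now):
        total = 0
        for b_start, b_end in split_into_buckets(start, end, self.bucket):
            # A bucket that ends at or before `now` is treated as immutable.
            closed = b_end <= now
            key = (b_start, b_end)
            if closed and key in self.store:
                self.hits += 1
                total += self.store[key]
                continue
            self.misses += 1
            value = self.compute(b_start, b_end)
            if closed:
                # Only closed buckets are cached; recent data is recomputed.
                self.store[key] = value
            total += value
        return total
```

When a rolling window slides forward by one bucket, the buckets shared with the previous window are served from the cache, and only the newly closed bucket and the still-open bucket reach the backend.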
-
OpenAI Introduces WebSocket-Based Execution Mode to Reduce Latency in Agentic Workflows
OpenAI introduces a WebSocket-based execution mode for its Responses API to improve agentic workflow performance in coding agents and real-time AI systems. The update reduces latency by up to 40 percent by replacing HTTP request-response cycles with persistent connections, improving streaming, tool execution, and multi-step orchestration in production-scale AI systems.
-
Designing Memory for AI Agents: Inside LinkedIn’s Cognitive Memory Agent
LinkedIn introduces Cognitive Memory Agent (CMA), a generative AI infrastructure layer enabling stateful, context-aware systems. It provides persistent memory across episodic, semantic, and procedural layers, supporting multi-agent coordination, retrieval, and lifecycle management. CMA addresses LLM statelessness and enables production-grade personalization and long-term context in AI applications.
-
Pinterest Reduces Spark OOM Failures by 96% through Auto Memory Retries
Pinterest Engineering cut Apache Spark out-of-memory failures by 96% using improved observability, configuration tuning, and automatic memory retries. Staged rollout, dashboards, and proactive memory adjustments stabilized data pipelines, reduced manual intervention, and lowered operational overhead across tens of thousands of daily jobs.
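The summary does not say how the automatic memory retries work, so the following is only a sketch of the general idea: rerun a failed job with a larger memory allocation until it succeeds or a cap is reached. The `OutOfMemory` exception, the `run_with_memory_retries` function, the doubling factor, and the cap are hypothetical and are not Pinterest's or Spark's API.

```python
class OutOfMemory(Exception):
    """Hypothetical signal that a job ran out of memory."""


def run_with_memory_retries(job, initial_gb, max_gb, factor=2.0):
    """Run job(memory_gb); on OutOfMemory, retry with more memory up to max_gb.

    Returns (result, memory_gb_used). Re-raises OutOfMemory once the cap
    has been tried and still fails.
    """
    memory_gb = initial_gb
    while True:
        try:
            return job(memory_gb), memory_gb
        except OutOfMemory:
            if memory_gb >= max_gb:
                raise  # the cap was already tried; give up
            memory_gb = min(memory_gb * factor, max_gb)
```

A real scheduler would also record each attempt so that later runs of the same job can start from a memory setting that is known to work.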
-
Discord Engineers Add Distributed Tracing to Elixir's Actor Model without Performance Penalty
Discord engineering detailed how they added distributed tracing to Elixir's actor model. Their custom Transport library wraps messages with trace context and uses dynamic sampling to handle million-user fanouts. CPU optimizations included skipping unsampled traces and filtering context before deserialization, recovering 10+ percentage points of overhead.
-
Inside Agoda’s Storefront: a Latency-Aware Reverse Proxy for Improving DNS-Based Load Distribution
Agoda engineers developed Storefront, a Rust-based S3-compatible reverse proxy that improves load balancing, request routing, and observability across large-scale object storage systems. The proxy addresses the limitations of DNS-based distribution, implements latency-aware routing, cross-data-center optimizations, IO safeguards, and credential-less authentication, and exposes telemetry via OpenTelemetry.
-
Inside Netflix’s Graph Abstraction: Handling 650TB of Graph Data in Milliseconds Globally
Netflix engineers built Graph Abstraction, a high-throughput platform managing 650 TB of graph data with millisecond latency. Supporting services from Netflix Gaming’s social graphs to operational topology graphs, it maintains global availability via asynchronous replication. This article covers its architecture, caching, and traversal design for high-scale performance.
-
From Minutes to Seconds: Uber Boosts MySQL Cluster Uptime with Consensus Architecture
Uber redesigned its MySQL fleet using a consensus-driven architecture based on MySQL Group Replication, reducing cluster failover time from minutes to seconds. By moving leader election and failure detection into the database layer, Uber improved availability, simplified external orchestration, and strengthened consistency across thousands of production clusters.
-
Hybrid Cloud Data at Uber: How Engineers Solved Extreme-Scale Replication Challenges
Uber’s HiveSync team optimized Hadoop DistCp to handle multi-petabyte replication across hybrid cloud and on-premises data lakes. Enhancements include task parallelization, Uber jobs for small transfers, and improved observability, enabling 5x replication capacity and seamless on-premises-to-cloud migration.
-
uForwarder: Uber’s Scalable Kafka Consumer Proxy for Efficient Event-Driven Microservices
Uber has open-sourced uForwarder, a push-based Kafka consumer proxy built to handle trillions of messages and multiple petabytes of data daily. The system introduces context-aware routing, head-of-line blocking mitigation, adaptive auto-rebalancing, and partition-level delay processing to improve scalability, workload isolation, and hardware efficiency in large-scale event-driven microservices.
-
How Dropbox Built a Scalable Context Engine for Enterprise Knowledge Search
Dropbox engineers have detailed how the company built the context engine behind Dropbox Dash, revealing a shift toward index-based retrieval, knowledge graph-derived context, and continuous evaluation to support enterprise AI at scale.
-
Uber and OpenAI Retool Rate Limiting Systems
Uber and OpenAI are replacing static rate limits with adaptive, infrastructure-level platforms. Uber’s Global Rate Limiter utilizes probabilistic shedding to manage 80M RPS, while OpenAI’s Access Engine implements a credit waterfall to prevent user interruptions. Both architectures utilize distributed enforcement and soft controls to maintain system stability and service continuity at scale.
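Probabilistic shedding can be sketched as follows. The formula is a common textbook choice and an assumption here, not Uber's documented algorithm: when observed traffic exceeds the limit, each request is dropped with probability `1 - limit / observed`, so the admitted rate averages out to the limit. The function names are hypothetical.

```python
import random


def drop_probability(observed_rps, limit_rps):
    """Fraction of requests to shed so admitted traffic averages limit_rps."""
    if observed_rps <= 0 or observed_rps <= limit_rps:
        return 0.0
    return 1.0 - limit_rps / observed_rps


def should_admit(observed_rps, limit_rps, rng=random.random):
    """Admit or shed one request; rng returns a float in [0, 1)."""
    return rng() >= drop_probability(observed_rps, limit_rps)
```

Unlike a hard cutoff, this sheds a fraction of traffic from every caller over the limit instead of rejecting everything once a counter is exhausted.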
-
GitHub Reworks Layered Defenses after Legacy Protections Block Legitimate Traffic
GitHub engineers recently traced user reports of unexpected “Too Many Requests” errors to abuse-mitigation rules that had accidentally remained active long after the incidents that prompted them.
-
Cloudflare Open Sources tokio-quiche, Promising Easier QUIC and HTTP/3 in Rust
Cloudflare has open-sourced tokio-quiche, an asynchronous QUIC and HTTP/3 Rust library that wraps its battle-tested quiche implementation with the Tokio runtime to simplify the development of high-performance QUIC applications. Cloudflare has used the library internally to back its edge services, the Oxy HTTP proxies, and the MASQUE-based tunnels that replace the WireGuard-based tunnels in the WARP client.
-
Benchmarking beyond the Application Layer: How Uber Evaluates Infrastructure Changes and Cloud SKUs
Uber’s Ceilometer framework automates infrastructure performance benchmarking beyond applications. It standardizes testing across servers, workloads, and cloud SKUs, helping teams validate changes, identify regressions, and optimize resources. Future plans include AI integration, anomaly detection, and continuous validation.