InfoQ Homepage Distributed Systems Content on InfoQ
-
Slack Outlines Four-Phase Journey to a Multi-Cloud AI Serving Platform
Slack has outlined how its AI serving infrastructure evolved through four distinct phases, moving from a self-managed Amazon SageMaker deployment to a multi-cloud architecture spanning AWS Bedrock and Google Cloud Vertex AI.
-
Inside Atlassian’s Forge Billing Architecture for Distributed Usage Tracking at Scale
Atlassian details the Forge billing platform built for usage-based pricing across its cloud ecosystem. It processes large-scale usage events with correct attribution, deduplication, and aggregation using a streaming pipeline, idempotent processing, and layered storage to enable accurate billing, near real-time visibility, and reliable reconciliation across distributed services.
-
Behind the Scenes: Block 450 JVM Repositories into Monorepo to Reduce Dependency Drift
Block, Inc. describes migrating ~450 JVM repositories into a monorepo across Cash App and Square engineering to reduce dependency drift and coordination overhead. The system supports ~8,800 weekly builds with ~10 min p90 CI time. The approach improves cross-service changes, build visibility, and developer experience through dependency graph–based builds, selective CI, and custom IDE tooling.
-
Lyft Uses Mapping Intelligence to Reduce Friction in Gated Community Pickups
Lyft details a new pickup experience to improve reliability in gated communities, where 25–30% of rides face routing and access challenges. The system uses mapping signals, boundary detection, and routing improvements to reduce cancellations and coordination overhead between riders and drivers, highlighting how real-world constraints drive evolution in geospatial systems.
-
Pinterest Uses Content Fingerprints for URL Deduplication across Millions of Domains
Pinterest introduced MIQPS, a URL normalization system that identifies which query parameters affect page identity using rendered content fingerprints. It reduces duplicate processing across millions of domains by replacing rule-based approaches with offline analysis, anomaly detection, and runtime parameter maps, improving ingestion efficiency and scalability in large-scale content pipelines.
-
30+ Updates per Second per Account: Uber Scales Ledger Processing with Batching
Uber introduced a high-throughput financial ledger processing system designed to handle hot account write contention at scale. Using 250ms batching, Redis coordination, and optimistic atomic updates, the system supports 30+ updates per second per account while preserving consistency and auditability, reducing multi-hour processing pipelines to minutes in its distributed accounting infrastructure.
-
Inside Google’s System for Coordinated A/B Testing across its Global Service Fleet
Google has shared details of its fleet wide large scale A/B experimentation system designed to standardize experiment assignment, exposure logging, and configuration propagation across distributed services. The approach enables consistent measurement across products, reduces experiment conflicts, and improves reliability of data driven decision making at scale.
-
Shopify Reports 15X Faster Graphql Execution with Breadth First Engine
Shopify introduced GraphQL Cardinal, a new execution engine replacing depth-first traversal with breadth-first execution. The redesign improves large-scale GraphQL performance with up to 15x faster field execution, 6x lower GC overhead, and +4s P50 latency gains. It focuses on execution-layer efficiency and batched resolver processing for high-cardinality commerce queries.
-
Uber Improves Restaurant Recommendations Using Real-Time Signals and Listwise Ranking
Uber updates its Uber Eats Home Feed recommendation system using near real-time user sequence features and a Generative Recommender model. The system evolves from hand-crafted features to transformer-based sequence modeling, reduces feature freshness from 24 hours to seconds, and shifts from pointwise scoring to listwise GenRec for improved contextual ranking and real-time personalization.
-
Airbnb Implements Context-Aware Identity Model to Support Privacy-First Social Features
Airbnb has redesigned its identity system to support privacy-first social features in Experiences. The platform introduces context-specific profiles that separate global user identity from externally visible profiles, preventing cross-context linkage. The migration leveraged automated auditing, manual validation, and AI-assisted refactoring to enforce correct identity usage across services.
-
Netflix Serves 84% of Query Results from Cache with Interval-Aware Caching in Apache Druid
Netflix improves Apache Druid performance with interval aware caching, serving 84% of analytics results from cache and reducing query load by 33%. The system decomposes rolling window queries into reusable time segments, enabling partial cache reuse and recomputation only for recent data. At scale, it reduces scan volume, improves P90 latency, and optimizes real time analytics workloads.
-
OpenAI Introduces Websocket-Based Execution Mode to Reduce Latency in Agentic Workflows
OpenAI introduces a WebSocket-based execution mode for its Responses API to improve agentic workflow performance in coding agents and real-time AI systems. The update reduces latency by up to 40 percent by replacing HTTP request-response cycles with persistent connections, improving streaming, tool execution, and multi-step orchestration in production-scale AI systems.
-
Designing Memory for AI Agents: inside Linkedin’s Cognitive Memory Agent
LinkedIn introduces Cognitive Memory Agent (CMA), generative AI infrastructure layer enabling stateful, context-aware systems. It provides persistent memory across episodic, semantic, and procedural layers, supporting multi-agent coordination, retrieval, and lifecycle management. CMA addresses LLM statelessness and enables production-grade personalization and long-term context in AI applications.
-
Pinterest Reduces Spark OOM Failures by 96% through Auto Memory Retries
Pinterest Engineering cut Apache Spark out-of-memory failures by 96% using improved observability, configuration tuning, and automatic memory retries. Staged rollout, dashboards, and proactive memory adjustments stabilized data pipelines, reduced manual intervention, and lowered operational overhead across tens of thousands of daily jobs.
-
Discord Engineers Add Distributed Tracing to Elixir's Actor Model without Performance Penalty
Discord engineering detailed how they added distributed tracing to Elixir's actor model. Their custom Transport library wraps messages with trace context and uses dynamic sampling to handle million-user fanouts. CPU optimizations included skipping unsampled traces and filtering context before deserialization, recovering 10+ percentage points of overhead.