Apache Kafka Content on InfoQ

News

Architecture & Design

Uber Moves In-House Search Indexing to Pull-Based Ingestion in OpenSearch

Uber transitions its in-house search indexing to OpenSearch with a pull-based ingestion framework, improving reliability, backpressure handling, and multi-region consistency for large-scale streaming data while simplifying recovery and supporting global, real-time search experiences.

Leela Kumili
on Feb 09, 2026
Architecture & Design

LinkedIn Re-Architects Service Discovery: Replacing ZooKeeper with Kafka and xDS at Scale

LinkedIn's engineering team successfully upgraded its legacy ZooKeeper service discovery platform to enhance scalability and performance. By leveraging Apache Kafka and the xDS protocol, the new architecture enables eventual consistency, supports multiple languages, and allows migration without downtime. After the upgrade, latency improved substantially, and the platform now supports hundreds of thousands of app instances.

Patrick Farry
on Feb 05, 2026
Architecture & Design

From On-Demand to Live: Netflix Streaming to 100 Million Devices in under 1 Minute

Netflix’s global live streaming platform powers millions of viewers with cloud-based ingest, custom live origin, Open Connect delivery, and real-time recommendations. This article explores the architecture, low-latency pipelines, adaptive bitrate streaming, and operational monitoring that ensure reliable, scalable, and synchronized live event experiences worldwide.

Leela Kumili
on Dec 05, 2025
Architecture & Design

Grab Adds Real-Time Data Quality Monitoring to Its Platform

Grab updated its internal platform to monitor Apache Kafka data quality in real time. The system uses FlinkSQL and an LLM to detect syntactic and semantic errors. It currently tracks 100+ topics, preventing invalid data from reaching downstream users. This proactive strategy aligns with industry trends to treat data streams as reliable products.

Patrick Farry
on Dec 05, 2025
Architecture & Design

Karrot Improves Conversion Rates by 70% with New Scalable Feature Platform on AWS

Karrot replaced its legacy recommendation system with a scalable architecture that leverages various AWS services. The company sought to address challenges related to tight coupling, limited scalability, and poor reliability in its previous solution, opting instead for a distributed, event-driven architecture built on top of scalable cloud services.

Rafal Gancarz
on Dec 04, 2025
Java

Spring News Roundup: GA Releases of Boot, Security, GraphQL, Integration, Modulith, Batch

Following the much-anticipated release of Spring Framework 7.0, there was a flurry of activity in the Spring ecosystem during the week of November 17th, 2025, highlighting additional GA releases of Spring Boot, Spring Security, Spring for GraphQL, Spring Integration, Spring Modulith, Spring REST Docs and Spring Batch.

Michael Redlich
on Nov 24, 2025
Architecture & Design

From Outages to Order: Netflix’s Approach to Database Resilience with WAL

Netflix uses a Write-Ahead Log (WAL) system to improve data platform resilience, addressing data loss, replication entropy, multi-partition failures, and corruption. WAL decouples producers and consumers, leverages SQS/Kafka with dead-letter queues, and supports delay queues, cross-region replication, and multi-table mutations for high-throughput, consistent, and recoverable database operations.

Leela Kumili
on Oct 31, 2025
Architecture & Design

Airbnb’s Mussel V2: Next-Gen Key Value Storage to Unify Streaming and Bulk Ingestion

Airbnb’s engineering team re-architected its internal key-value storage system, Mussel, to unify streaming and bulk ingestion while simplifying operations. The new version achieves over 100,000 writes per second and sub-25ms read latencies on 100-terabyte tables, and leverages Kubernetes, Kafka, and a NewSQL backend to improve scalability, reliability, and operational efficiency across Airbnb's internal services.

Leela Kumili
on Oct 24, 2025
Java

Spring News Roundup: Third Milestone Releases of Boot, Security, GraphQL, Integration, Modulith

There was a flurry of activity in the Spring ecosystem during the week of September 15th, 2025, highlighting third milestone releases of Spring Boot, Spring Security, Spring for GraphQL, Spring Integration, Spring Modulith, Spring REST Docs, Spring Batch and Spring for Apache Pulsar. There were also resolutions to CVEs in Spring Framework and Spring Security.

Michael Redlich
on Sep 22, 2025
DevOps

PagerDuty's Kafka Outage Silences Alerts for Thousands of Companies

PagerDuty, the incident management platform used by thousands of organisations to alert them to problems on their systems, suffered a major outage itself on 28th August, 2025. In a comprehensive outage report, the company detailed the scope of the problem, the customer impact, and how it is working to prevent a recurrence.

Matt Saunders
on Sep 16, 2025
Java

Java News Roundup: OpenJDK, TornadoVM, Payara Platform, Apache Kafka, Grails, Micronaut

This week's Java roundup for September 1st, 2025, features news highlighting: JEP 517 proposed to target for JDK 26; TornadoVM releases GPULlama3.java 0.2.0; the September 2025 edition of the Payara Platform; point releases of Quarkus, Micronaut, Apache Kafka and Apache Tomcat; and second release candidates of Grails 7.0 and Gradle 9.1.

Michael Redlich
on Sep 08, 2025
Architecture & Design

Netflix Revamps Tudum’s CQRS Architecture with RAW Hollow In-Memory Object Store

Netflix replaced a CQRS implementation using Kafka and Cassandra with a new solution leveraging RAW Hollow, an in-memory object store developed internally. The revamped architecture of Tudum offers much faster content preview during the editorial process and faster page rendering for visitors.

Rafal Gancarz
on Aug 15, 2025
DevOps

Agoda Handles Kafka Consumer Failover across Data Centers with Custom Two-Way Sync

Agoda's engineering team recently shared its custom solution for maintaining critical Kafka consumer operations across multiple on-premises data centers, ensuring business continuity even during outages.

Craig Risi
on Aug 13, 2025
Architecture & Design

DoorDash Introduces Config-Driven Badge Framework to Decouple UI Logic

DoorDash has launched a badge serving framework (BSF), a configuration-based system that decouples UI badge logic from application code. BSF allows the company to manage badges through backend configuration instead of client-side updates, enabling faster rollouts and more consistent behavior across platforms.

Leela Kumili
on Aug 08, 2025
Architecture & Design

Inside Netflix’s Title Launch Observability System: Validating Title Availability at Global Scale

Netflix has developed a platform called Title Launch Observability, which shifts observability from system health to product intent. Instead of relying solely on logs and metrics, the system validates launches against what users should see, catching content quality issues early. The platform helps detect issues such as missing artwork, incorrect recommendations, or localization gaps.

Leela Kumili
on Jul 12, 2025
