Apache Kafka Content on InfoQ
-
Grab Adds Real-Time Data Quality Monitoring to Its Platform
Grab updated its internal platform to monitor Apache Kafka data quality in real time. The system uses FlinkSQL and an LLM to detect syntactic and semantic errors. It currently tracks 100+ topics, preventing invalid data from reaching downstream users. This proactive strategy aligns with the industry trend of treating data streams as reliable products.
-
Karrot Improves Conversion Rates by 70% with New Scalable Feature Platform on AWS
Karrot replaced its legacy recommendation system with a scalable architecture that leverages various AWS services. The company sought to address challenges related to tight coupling, limited scalability, and poor reliability in its previous solution, opting instead for a distributed, event-driven architecture built on top of scalable cloud services.
-
Spring News Roundup: GA Releases of Boot, Security, GraphQL, Integration, Modulith, Batch
Following the much-anticipated release of Spring Framework 7.0, there was a flurry of activity in the Spring ecosystem during the week of November 17th, 2025, highlighting additional GA releases of Spring Boot, Spring Security, Spring for GraphQL, Spring Integration, Spring Modulith, Spring REST Docs and Spring Batch.
-
From Outages to Order: Netflix’s Approach to Database Resilience with WAL
Netflix uses a Write-Ahead Log (WAL) system to improve data platform resilience, addressing data loss, replication entropy, multi-partition failures, and corruption. WAL decouples producers and consumers, leverages SQS/Kafka with dead-letter queues, and supports delay queues, cross-region replication, and multi-table mutations for high-throughput, consistent, and recoverable database operations.
-
Airbnb’s Mussel V2: Next-Gen Key Value Storage to Unify Streaming and Bulk Ingestion
Airbnb’s engineering team re-architected its internal key-value storage system, Mussel, to unify streaming and bulk ingestion while simplifying operations. The new version achieves over 100,000 writes per second and sub-25ms read latencies on 100-terabyte tables, and it leverages Kubernetes, Kafka, and a NewSQL backend to improve scalability, reliability, and operational efficiency across Airbnb's internal services.
-
Spring News Roundup: Third Milestone Releases of Boot, Security, GraphQL, Integration, Modulith
There was a flurry of activity in the Spring ecosystem during the week of September 15th, 2025, highlighting third milestone releases of Spring Boot, Spring Security, Spring for GraphQL, Spring Integration, Spring Modulith, Spring REST Docs, Spring Batch and Spring for Apache Pulsar. There were also resolutions to CVEs in Spring Framework and Spring Security.
-
PagerDuty's Kafka Outage Silences Alerts for Thousands of Companies
PagerDuty, the incident management platform used by thousands of organisations to alert them to problems on their systems, suffered a major outage itself on 28th August, 2025. In a comprehensive outage report, the company detailed the scope of the problem, the customer impact, and how it is working to prevent a recurrence.
-
Java News Roundup: OpenJDK, TornadoVM, Payara Platform, Apache Kafka, Grails, Micronaut
This week's Java roundup for September 1st, 2025, features news highlighting: JEP 517 proposed to target for JDK 26; TornadoVM releases GPULlama3.java 0.2.0; the September 2025 edition of the Payara Platform; point releases of Quarkus, Micronaut, Apache Kafka and Apache Tomcat; and second release candidates of Grails 7.0 and Gradle 9.1.
-
Netflix Revamps Tudum’s CQRS Architecture with RAW Hollow In-Memory Object Store
Netflix replaced a CQRS implementation using Kafka and Cassandra with a new solution leveraging RAW Hollow, an in-memory object store developed internally. The revamped Tudum architecture offers much faster content previews during the editorial process and faster page rendering for visitors.
-
Agoda Handles Kafka Consumer Failover across Data Centers with Custom Two-Way Sync
Agoda's engineering team recently shared its custom solution for maintaining critical Kafka consumer operations across multiple on-premises data centers, ensuring business continuity even during outages.
-
DoorDash Introduces Config-Driven Badge Framework to Decouple UI Logic
DoorDash has launched a badge serving framework (BSF), a configuration-based system that decouples UI badge logic from application code. BSF allows the company to manage badges through backend configuration instead of client-side updates, enabling faster rollouts and more consistent behavior across platforms.
-
Inside Netflix’s Title Launch Observability System: Validating Title Availability at Global Scale
Netflix has developed a platform called Title Launch Observability, which shifts observability from system health to product intent. Instead of relying solely on logs and metrics, the system validates launches against what users should see, catching content quality issues early. The platform helps detect issues such as missing artwork, incorrect recommendations, or localization gaps.
-
AWS Lambda Gains Native Avro and Protobuf Support for Kafka Events with Schema Registry Integration
AWS Lambda now natively supports Apache Avro and Protobuf events, streamlining Kafka event processing. The enhancement eliminates the need for custom deserialization, automates schema validation and filtering, and optimizes costs through efficient event handling. Integration with the AWS Glue and Confluent schema registries simplifies development, allowing cleaner data consumption and enhanced scalability.
-
LinkedIn Announces Northguard and Xinfra: Scaling beyond Kafka for Log Storage and Pub/Sub
LinkedIn today announced Northguard, a scalable log storage system that replaces Kafka, and Xinfra, a virtualized Pub/Sub layer. Northguard delivers sharded data and metadata, log striping, strong consistency, and self-balancing clusters at a larger scale than Kafka, while Xinfra enables seamless migration and unified access across Kafka and Northguard.
-
Spring News Roundup: Spring Vault Milestone, Point Releases and End of OSS Support
There was a flurry of activity in the Spring ecosystem during the week of June 16th, 2025, highlighting: the first milestone release of Spring Vault 4.0; and point releases of Spring Boot, Spring Security, Spring Authorization Server, Spring Modulith, Spring AMQP and Spring for Apache Kafka. Release trains for numerous Spring projects will also reach the end of OSS support on June 30, 2025.