Apache Kafka Content on InfoQ

News

Architecture & Design

Grab Adds Real-Time Data Quality Monitoring to Its Platform

Grab updated its internal platform to monitor Apache Kafka data quality in real time. The system uses FlinkSQL and an LLM to detect syntactic and semantic errors. It currently tracks 100+ topics, preventing invalid data from reaching downstream users. This proactive strategy aligns with the industry trend of treating data streams as reliable products.

Patrick Farry
on Dec 05, 2025

Architecture & Design

Karrot Improves Conversion Rates by 70% with New Scalable Feature Platform on AWS

Karrot replaced its legacy recommendation system with a scalable architecture that leverages various AWS services. The company sought to address challenges related to tight coupling, limited scalability, and poor reliability in its previous solution, opting instead for a distributed, event-driven architecture built on top of scalable cloud services.

Rafał Gancarz
on Dec 04, 2025

Java

Spring News Roundup: GA Releases of Boot, Security, GraphQL, Integration, Modulith, Batch

Following the much-anticipated release of Spring Framework 7.0, there was a flurry of activity in the Spring ecosystem during the week of November 17th, 2025, highlighting additional GA releases of Spring Boot, Spring Security, Spring for GraphQL, Spring Integration, Spring Modulith, Spring REST Docs and Spring Batch.

Michael Redlich
on Nov 24, 2025

Architecture & Design

From Outages to Order: Netflix’s Approach to Database Resilience with WAL

Netflix uses a Write-Ahead Log (WAL) system to improve data platform resilience, addressing data loss, replication entropy, multi-partition failures, and corruption. WAL decouples producers and consumers, leverages SQS/Kafka with dead-letter queues, and supports delay queues, cross-region replication, and multi-table mutations for high-throughput, consistent, and recoverable database operations.

Leela Kumili
on Oct 31, 2025

Architecture & Design

Airbnb’s Mussel V2: Next-Gen Key Value Storage to Unify Streaming and Bulk Ingestion

Airbnb’s engineering team re-architected its internal key-value storage system, Mussel, to unify streaming and bulk ingestion while simplifying operations. The new version achieves over 100,000 writes per second and sub-25ms read latencies on 100-terabyte tables. It leverages Kubernetes, Kafka, and a NewSQL backend to improve scalability, reliability, and operational efficiency across Airbnb's internal services.

Leela Kumili
on Oct 24, 2025

Java

Spring News Roundup: Third Milestone Releases of Boot, Security, GraphQL, Integration, Modulith

There was a flurry of activity in the Spring ecosystem during the week of September 15th, 2025, highlighting third milestone releases of Spring Boot, Spring Security, Spring for GraphQL, Spring Integration, Spring Modulith, Spring REST Docs, Spring Batch and Spring for Apache Pulsar. There were also resolutions to CVEs in Spring Framework and Spring Security.

Michael Redlich
on Sep 22, 2025

DevOps

PagerDuty's Kafka Outage Silences Alerts for Thousands of Companies

PagerDuty, the incident management platform used by thousands of organisations to alert them to problems on their systems, suffered a major outage itself on 28th August, 2025. In a comprehensive outage report, the company detailed the scope of the problem, the customer impact, and how it is working to prevent a recurrence.

Matt Saunders
on Sep 16, 2025

Java

Java News Roundup: OpenJDK, TornadoVM, Payara Platform, Apache Kafka, Grails, Micronaut

This week's Java roundup for September 1st, 2025, features news highlighting: JEP 517 proposed to target for JDK 26; TornadoVM releases GPULlama3.java 0.2.0; the September 2025 edition of the Payara Platform; point releases of Quarkus, Micronaut, Apache Kafka and Apache Tomcat; and second release candidates of Grails 7.0 and Gradle 9.1.

Michael Redlich
on Sep 08, 2025

Architecture & Design

Netflix Revamps Tudum’s CQRS Architecture with RAW Hollow In-Memory Object Store

Netflix replaced a CQRS implementation using Kafka and Cassandra with a new solution leveraging RAW Hollow, an in-memory object store developed internally. The revamped Tudum architecture offers much faster content previews during the editorial process and faster page rendering for visitors.

Rafał Gancarz
on Aug 15, 2025

DevOps

Agoda Handles Kafka Consumer Failover across Data Centers with Custom Two-Way Sync

Agoda's engineering team recently shared its custom solution for keeping critical Kafka consumer operations running across multiple on-premises data centers, ensuring business continuity even during outages.

Craig Risi
on Aug 13, 2025

Architecture & Design

DoorDash Introduces Config-Driven Badge Framework to Decouple UI Logic

DoorDash has launched a badge serving framework (BSF), a configuration-based system that decouples UI badge logic from application code. BSF allows the company to manage badges through backend configuration instead of client-side updates, enabling faster rollouts and more consistent behavior across platforms.

Leela Kumili
on Aug 08, 2025

Architecture & Design

Inside Netflix’s Title Launch Observability System: Validating Title Availability at Global Scale

Netflix has developed a platform called Title Launch Observability, which shifts observability from system health to product intent. Instead of relying solely on logs and metrics, the system validates launches against what users should see, catching content quality issues early. The platform helps detect issues such as missing artwork, incorrect recommendations, or localization gaps.

Leela Kumili
on Jul 12, 2025

Cloud

AWS Lambda Gains Native Avro and Protobuf Support for Kafka Events with Schema Registry Integration

Lambda now natively supports Apache Avro and Protobuf events, streamlining Kafka event processing. The enhancement eliminates the need for custom deserialization, automates schema validation and filtering, and optimizes costs through efficient event handling. Integration with the AWS Glue and Confluent schema registries simplifies development, allowing cleaner data consumption and enhanced scalability.

Steef-Jan Wiggers
on Jun 27, 2025

Architecture & Design

LinkedIn Announces Northguard and Xinfra: Scaling beyond Kafka for Log Storage and Pub/Sub

LinkedIn announced Northguard, a scalable log storage system that replaces Kafka, and Xinfra, a virtualized Pub/Sub layer. Northguard delivers sharded data and metadata, log striping, strong consistency, and self-balancing clusters at a larger scale than Kafka, while Xinfra enables seamless migration and unified access across Kafka and Northguard.

Eran Stiller
on Jun 25, 2025

Java

Spring News Roundup: Spring Vault Milestone, Point Releases and End of OSS Support

There was a flurry of activity in the Spring ecosystem during the week of June 16th, 2025, highlighting: the first milestone release of Spring Vault 4.0; and point releases of Spring Boot, Spring Security, Spring Authorization Server, Spring Modulith, Spring AMQP and Spring for Apache Kafka. Release trains for numerous Spring projects will also reach the end of OSS support on June 30, 2025.

Michael Redlich
on Jun 23, 2025
