Apache Kafka Content on InfoQ

Articles

AI, ML & Data Engineering

Architecting Cloud-Native Kafka: from Tiered Storage towards a Diskless Future

This article explores Kafka's transition toward a cloud-native architecture, examining how tiered storage, FinOps telemetry, elastic consumer scaling, virtual clusters, and Share Groups reshape the operational and economic model of event streaming platforms. It also analyzes emerging diskless-storage proposals and their architectural trade-offs.

Viquar Khan
on May 26, 2026
Java

The Schema Proliferation Problem in Kafka and Flink Pipelines: How to Solve It

Schema proliferation builds slowly and gets expensive fast. One schema per event type feels right until there are ten tables, union queries spanning all of them, and a single field rename touching every schema. Discriminator-based schema consolidation collapses that to two tables, turning multi-table unions into a single query, while new variants are additive and don't break existing consumers.

Spoorthi Basu
on May 25, 2026
DevOps

Analyzing Apache Kafka Stretch Clusters: WAN Disruptions, Failure Scenarios, and DR Strategies

This article analyzes the dynamics of Apache Kafka stretch clusters, assessing WAN disruptions and failure scenarios and outlining Disaster Recovery (DR) strategies. It focuses on maintaining high availability and data integrity across multi-region deployments and on protecting vital services against service level agreement violations.

Srikanth Daggumalli Nishchai Jayanna Manjula
on Jun 20, 2025
Architecture & Design

Beyond Trends: A Practical Guide to Choosing the Right Message Broker

Choosing the right message broker for your application requires matching the appropriate technology with the messaging patterns needed. Message brokers can be broadly categorized as either stream-based or queue-based, each offering unique strengths and trade-offs.

Nehme Bilal
on Mar 19, 2025
Architecture & Design

Managing 238M Memberships at Netflix

In this article, Surabhi Diwan shares how the Netflix membership team does distributed systems: the architecture bets, technology choices, and operational semantics that serve the needs of Netflix’s ever-growing member base.

Surabhi Diwan
on Mar 25, 2024
.NET

Building Kafka Event-Driven Applications with KafkaFlow

KafkaFlow, a .NET open-source project, simplifies Kafka-based event-driven app development with features like middleware for message processing, improving maintainability and customization and letting developers prioritize business logic.

Guilherme Ferreira
on Sep 08, 2023
Architecture & Design

Tales of Kafka at Cloudflare: Lessons Learnt on the Way to 1 Trillion Messages

Cloudflare uses Kafka clusters to decouple microservices and to communicate the creation, change or deletion of various resources in a fault-tolerant manner via protobuf, a common data format. The authors suggest investing in metrics for problem detection, prioritizing clear SDK documentation, and balancing flexibility and simplicity for standardized pipelines.

Matt Boyle Andrea Medda
on May 29, 2023
Java

Article Series: Developing Apache Kafka applications on Kubernetes

Apache Kafka has integrations with most of the languages used these days, but in this article series, we cover its integration with Java. In this series, we also discuss how to provision, configure and secure an Apache Kafka cluster on a Kubernetes cluster.

Alex Soto
on Feb 06, 2023
Java

Securing a Kafka Cluster in Kubernetes Using Strimzi

Deploying an Apache Kafka cluster to Kubernetes is easy if you use Strimzi, but that’s only the first step. You also need to secure the communication between Kafka and its consumers and producers, provide RBAC for access to topics, and distribute secrets correctly to Kafka Connect components, all in a Kubernetes GitOps way.

Alex Soto
on Dec 30, 2022
Architecture & Design

Building & Operating High-Fidelity Data Streams

At QCon Plus 2021 last November, Sid Anand, chief architect at Datazoom and PMC Member at Apache Airflow, presented on building high-fidelity nearline data streams as a service within a lean team. In this talk, Anand provides a master class on building high-fidelity data streams from the ground up.

Sid Anand
on Sep 30, 2022
Java

Moving Kafka and Debezium to Kubernetes Using Strimzi - the GitOps Way

Deploying an Apache Kafka cluster to Kubernetes is not an easy task. There are a lot of pieces to configure, such as ZooKeeper, the Kafka cluster, topics, and users. Strimzi is a Kubernetes controller that makes deploying Kafka child's play. Moreover, Strimzi lets you manage Kafka using a GitOps methodology, as everything is executed using a Kubernetes YAML file.

Alex Soto
on Sep 28, 2022
Java

Debezium and Quarkus: Change Data Capture Patterns to Avoid Dual-Writes Problems

It’s common in microservices to write data to two places: first to a database, and then to another microservice. One approach to this is dual writes, but you may lose data because of concurrent writes. Debezium is an open-source project for change data capture that uses the log scanner approach to avoid dual writes and communicate persisted data correctly between services.

Alex Soto
on Aug 15, 2022
