Apache Kafka Content on InfoQ
-
Architecting Cloud-Native Kafka: from Tiered Storage towards a Diskless Future
This article explores Kafka's transition toward a cloud-native architecture, examining how tiered storage, FinOps telemetry, elastic consumer scaling, virtual clusters, and Share Groups reshape the operational and economic model of event streaming platforms. It also analyzes emerging diskless-storage proposals and their architectural trade-offs.
-
The Schema Proliferation Problem in Kafka and Flink Pipelines: How to Solve It
Schema proliferation builds slowly and gets expensive fast. One schema per event type feels right until there are ten tables, union queries spanning all of them, and a single field rename touching every schema. Discriminator-based schema consolidation collapses that to two tables, turning multi-table unions into a single query, while new variants are additive and don't break existing consumers.
-
Analyzing Apache Kafka Stretch Clusters: WAN Disruptions, Failure Scenarios, and DR Strategies
This article analyzes the dynamics of Apache Kafka stretch clusters, assessing how WAN disruptions and other failure scenarios affect them and which Disaster Recovery (DR) strategies apply. It focuses on maintaining high availability and data integrity across multi-region deployments and on protecting vital services against service level agreement violations.
-
Beyond Trends: A Practical Guide to Choosing the Right Message Broker
Choosing the right message broker for your application requires matching the appropriate technology with the messaging patterns needed. Message brokers can be broadly categorized as either stream-based or queue-based, each offering unique strengths and trade-offs.
-
Managing 238M Memberships at Netflix
In this article, Surabhi Diwan shares how the Netflix membership team does distributed systems: the architecture bets, technology choices, and operational semantics that serve the needs of Netflix’s ever-growing member base.
-
Building Kafka Event-Driven Applications with KafkaFlow
KafkaFlow, an open-source .NET project, simplifies the development of Kafka-based event-driven applications with features such as middleware for message processing. It improves maintainability and customizability, and lets developers focus on business logic.
-
Tales of Kafka at Cloudflare: Lessons Learnt on the Way to 1 Trillion Messages
Cloudflare uses Kafka clusters to decouple microservices and to communicate the creation, change, or deletion of various resources in a fault-tolerant manner, using protobuf as a common data format. The authors suggest investing in metrics for problem detection, prioritizing clear SDK documentation, and balancing flexibility and simplicity for standardized pipelines.
-
Article Series: Developing Apache Kafka applications on Kubernetes
Apache Kafka has integrations with most of the languages used these days, but in this article series, we cover its integration with Java. In this series, we also discuss how to provision, configure and secure an Apache Kafka cluster on a Kubernetes cluster.
-
Securing a Kafka Cluster in Kubernetes Using Strimzi
Deploying an Apache Kafka cluster to Kubernetes is easy if you use Strimzi, but that’s only the first step. You also need to secure the communication between Kafka and its consumers and producers, provide RBAC for topic access, and distribute secrets correctly to Kafka Connect components, all in a Kubernetes GitOps way.
-
Building & Operating High-Fidelity Data Streams
At QCon Plus 2021 last November, Sid Anand, chief architect at Datazoom and PMC Member at Apache Airflow, presented on building high-fidelity nearline data streams as a service within a lean team. In this talk, Anand provides a master class on building high-fidelity data streams from the ground up.
-
Moving Kafka and Debezium to Kubernetes Using Strimzi - the GitOps Way
Deploying an Apache Kafka cluster to Kubernetes is not an easy task. There are many pieces to configure, such as ZooKeeper, the Kafka cluster, topics, and users. Strimzi is a Kubernetes controller that makes deploying Kafka child's play. Moreover, Strimzi lets you manage Kafka using a GitOps methodology, as everything is done through Kubernetes YAML files.
-
Debezium and Quarkus: Change Data Capture Patterns to Avoid Dual-Writes Problems
It’s common in microservices to write data in two places: first to a database, and then to another microservice. Dual writes are one approach to this problem, but you may lose data because of concurrent writes. Debezium is an open-source project for change data capture that uses the log scanner approach to avoid dual writes and communicate persisted data correctly between services.