Apache Kafka Content on InfoQ
-
Apache Kafka 3.3 Replaces ZooKeeper with the New KRaft Consensus Protocol
The Apache Software Foundation has released Apache Kafka 3.3.1 with many new features and improvements. In particular, this is the first release that marks the KRaft (Kafka Raft) consensus protocol as production ready. In development for several years, KRaft was released in early access in Kafka 2.8, then in preview in Kafka 3.0.
-
AWS Lambda Supports Event Filtering for Amazon MSK, Kafka and Amazon MQ
Amazon recently announced that AWS Lambda supports content filtering options for Amazon MSK, Self-Managed Kafka, Amazon MQ for Apache ActiveMQ, and Amazon MQ for RabbitMQ as event sources. The new options extend the filtering to data store and broker services and reduce traffic to Lambda functions, simplifying application logic and reducing costs.
-
Netflix Builds a Custom High-Throughput Priority Queue Backed by Redis, Kafka and Elasticsearch
Netflix recently published how it built Timestone, a custom high-throughput, low-latency priority queueing system. The company built it using open-source components such as Redis, Apache Kafka, Apache Flink and Elasticsearch. Engineers state that they built Timestone because they could not find an off-the-shelf solution that met all of their requirements.
-
Grab Shared Its Experience in Designing Distributed Data Platform
GrabApp is an application through which customers select and buy their daily needs from merchants. To be scalable and manageable, the data platform and data ingestion should be designed as a distributed, fault-tolerant system. In designing this data platform, two classes of data stores are considered: OLTP and OLAP.
-
Confluent Introduces Stream Governance Advanced to Safely Extend Data Streaming Power
Confluent recently announced new enhancements to its Stream Governance product that will improve engineering teams' ability to discover, understand, and trust real-time data. Organizations can use Stream Governance Advanced to resolve issues within complex pipelines more easily with point-in-time lineage.
-
Confluent Ships Stream Designer Democratizing Data Streams
Confluent recently released Stream Designer, a visual interface that lets developers quickly build and deploy streaming data pipelines.
-
Next Generation of Data Movement and Processing Platform at Netflix
Netflix engineering recently published a tech blog post on how they used data mesh architecture and principles for their next-generation data movement and processing platform, opening up more business use cases and opportunities. Data mesh is a paradigm shift in data management that enables users to easily import and use data without transporting it to a centralized location like a data lake.
-
Fitting Presto to Large-Scale Apache Kafka at Uber
The need for ad-hoc real-time data analysis has been growing at Uber. They run a large Apache Kafka deployment and need to analyse data going through the many workflows it supports. Solutions like stream processing and OLAP datastores were deemed unsuitable. An article was published recently detailing why Uber chose Presto for this purpose and what it had to do to make it performant at scale.
-
Amazon MSK Serverless Now Generally Available
AWS recently announced that Amazon MSK Serverless is now generally available. The serverless option to manage an Apache Kafka cluster removes the need to monitor capacity and automatically balances partitions within a cluster.
-
Netflix Studio Search: Using Elasticsearch and Apache Flink to Index Federated GraphQL Data
Netflix engineers recently published how they built Studio Search, using Apache Kafka streams, an Apache Flink-based Data Mesh process, and Elasticsearch to manage the index. They designed the platform to take a portion of Netflix's federated GraphQL graph and make it searchable. Today, Studio Search powers a significant portion of the user experience for many applications within the organisation.
-
Kestra: a Scalable Open-Source Orchestration and Scheduling Platform
Kestra, a new open-source orchestration and scheduling platform, helps developers build, run, schedule, and monitor complex pipelines. The concept of a workflow, called a Flow in Kestra, is at the heart of the platform. A Flow is a list of tasks defined in a descriptive language based on YAML.
-
Real-Time Exactly-Once Event Processing at Uber with Apache Flink, Kafka, and Pinot
Uber faced some challenges after introducing ads on UberEats. The events they generated had to be processed quickly, reliably and accurately. These requirements were fulfilled by a system based on Apache Flink, Kafka, and Pinot that can process streams of ad events in real-time with exactly-once semantics. An article describing its architecture was published recently in the Uber Engineering blog.
-
Data Collection, Standardization and Usage at Scale in the Uber Rider App
Uber Engineering recently published how it collects, standardises and uses data from the Uber Rider app. Rider data comprises all the rider's interactions with the Uber app. This data accounts for billions of events from Uber's online systems every day. Uber uses this data to address top problem areas such as increasing funnel conversion and user engagement.
-
Netflix Builds a Reliable, Scalable Platform with Event Sourcing, MQTT and Alpakka-Kafka
Netflix recently published a blog post detailing how it built a reliable device management platform using an MQTT-based event sourcing implementation. To scale its solution, Netflix utilizes Apache Kafka, Alpakka-Kafka and CockroachDB.
-
Microsoft Announces Event Hubs Premium in Preview
Azure Event Hubs is Microsoft’s managed real-time event ingestion service designed to serve demanding big data streaming and event ingestion needs in the cloud. Microsoft announced the public preview of Event Hubs Premium during its annual Build conference as a new product SKU tailor-made for high-end event streaming scenarios that require elastic, superior performance with predictable latency.