Apache Kafka Content on InfoQ
-
Grab Shared Its Experience in Designing Distributed Data Platform
GrabApp is an application through which customers select and buy their daily needs from merchants. To be scalable and manageable, the data platform and its ingestion should be designed to be distributed and fault-tolerant. In designing this data platform, two classes of data stores are considered: OLTP and OLAP.
-
Confluent Introduces Stream Governance Advanced to Safely Extend Data Streaming Power
Confluent recently announced new enhancements to its Stream Governance product that will improve engineering teams' ability to discover, understand, and trust real-time data. Organizations can use Stream Governance Advanced to resolve issues within complex pipelines more easily with point-in-time lineage.
-
Confluent Ships Stream Designer Democratizing Data Streams
Confluent recently released Stream Designer, a visual interface that lets developers quickly build and deploy streaming data pipelines.
-
Next Generation of Data Movement and Processing Platform at Netflix
Netflix Engineering recently described in a tech blog post how it used data mesh architecture and principles for its next-generation data movement and processing platform, opening up more business use cases and opportunities. Data mesh is a paradigm shift in data management that enables users to easily import and use data without transporting it to a centralized location like a data lake.
-
Fitting Presto to Large-Scale Apache Kafka at Uber
The need for ad-hoc real-time data analysis has been growing at Uber. They run a large Apache Kafka deployment and need to analyse data going through the many workflows it supports. Solutions like stream processing and OLAP datastores were deemed unsuitable. An article was published recently detailing why Uber chose Presto for this purpose and what it had to do to make it performant at scale.
-
Amazon MSK Serverless Now Generally Available
AWS recently announced that Amazon MSK Serverless is now generally available. The serverless option to manage an Apache Kafka cluster removes the need to monitor capacity and automatically balances partitions within a cluster.
-
Netflix Studio Search: Using Elasticsearch and Apache Flink to Index Federated GraphQL Data
Netflix engineers recently published how they built Studio Search, using Apache Kafka streams, an Apache Flink-based Data Mesh process, and Elasticsearch to manage the index. They designed the platform to take a portion of Netflix's federated GraphQL graph and make it searchable. Today, Studio Search powers a significant portion of the user experience for many applications within the organisation.
-
Kestra: a Scalable Open-Source Orchestration and Scheduling Platform
Kestra, a new open-source orchestration and scheduling platform, helps developers build, run, schedule, and monitor complex pipelines. The concept of a workflow, called a Flow in Kestra, is at the heart of the platform. A Flow is a list of tasks defined in a descriptive language based on YAML.
-
Real-Time Exactly-Once Event Processing at Uber with Apache Flink, Kafka, and Pinot
Uber faced some challenges after introducing ads on UberEats. The events they generated had to be processed quickly, reliably and accurately. These requirements were fulfilled by a system based on Apache Flink, Kafka, and Pinot that can process streams of ad events in real time with exactly-once semantics. An article describing its architecture was published recently on the Uber Engineering blog.
-
Data Collection, Standardization and Usage at Scale in the Uber Rider App
Uber Engineering recently published how it collects, standardises and uses data from the Uber Rider app. Rider data comprises all of a rider's interactions with the Uber app and accounts for billions of events from Uber's online systems every day. Uber uses this data to address top problem areas such as increasing funnel conversion and user engagement.
-
Netflix Builds a Reliable, Scalable Platform with Event Sourcing, MQTT and Alpakka-Kafka
Netflix recently published a blog post detailing how it built a reliable device management platform using an MQTT-based event sourcing implementation. To scale its solution, Netflix utilizes Apache Kafka, Alpakka-Kafka and CockroachDB.
-
Microsoft Announces Event Hubs Premium in Preview
Azure Event Hubs is Microsoft’s managed real-time event ingestion service designed to serve demanding big data streaming and event ingestion needs in the Cloud. Microsoft announced the public preview of Event Hubs Premium during the annual Build conference as a new product SKU tailor-made for high-end event streaming scenarios requiring elastic, superior performance with predictable latency.
-
Confluent Announces Confluent for Kubernetes into General Availability
Recently, Confluent announced the general availability (GA) of Confluent for Kubernetes, a complete, declarative API-driven experience for deploying and self-managing Confluent Platform as a cloud-native system. With Confluent for Kubernetes, the company packages its event-streaming platform for Kubernetes and provides a cloud-native offering.
-
Airbnb Builds Himeji - a Scalable Centralized Authorization System
Airbnb recently described how it built Himeji, a scalable centralized authorization system. Himeji stores permissions data and performs permission checks as a central source of truth. It uses a sharded and replicated in-memory cache to improve performance and lower latencies and has served checks in production for about a year.
-
Confluent Announces Strategic Alliance with Microsoft
Confluent, the company founded by the original creators of Apache Kafka, recently announced a new strategic alliance with Microsoft to enable a more integrated experience between Confluent Cloud and the Azure platform.