Event Stream Processing Content on InfoQ

Articles

Architecture & Design

Tales of Kafka at Cloudflare: Lessons Learnt on the Way to 1 Trillion Messages

Cloudflare uses Kafka clusters to decouple microservices and to communicate the creation, change, or deletion of various resources in a fault-tolerant manner, using protobuf as a common data format. The authors suggest investing in metrics for problem detection, prioritizing clear SDK documentation, and balancing flexibility and simplicity for standardized pipelines.

Matt Boyle, Andrea Medda
on May 29, 2023
Java

Billions of Messages Per Minute Over TCP/IP

Chronicle Wire offers an alternative way of transferring data between systems, delivering more messages, faster, than common JSON/XML approaches. This approach to data serialization improves both latency and throughput.

George Ball
on Mar 24, 2023
AI, ML & Data Engineering

Streaming-First Infrastructure for Real-Time Machine Learning

This article covers the benefits of streaming-first infrastructure for two scenarios of real-time ML: online prediction, where a model receives a request and makes predictions as soon as the request arrives, and continual learning, where machine learning models continually adapt to changes in data distributions in production.

Chip Huyen
on Aug 22, 2022
Development

How to Create a Network Proxy Using Stream Processor Pipy

This article introduces Pipy, an open-source cloud-native network stream processor. After describing its modular design, it shows how to rapidly build a high-performance network proxy to serve specific needs. Pipy has been battle-tested and is already in use by multiple commercial clients.

Ali Naqvi
on Jan 31, 2022
AI, ML & Data Engineering

Beyond the Database, and beyond the Stream Processor: What's the Next Step for Data Management?

Databases have been around forever with the same shape: you make a request to your data and then you receive an answer. Stream processors then came along with a different approach: data isn’t locked up, it is in motion. Understand how stream processors and databases relate and why there is an emerging new category of databases that focus on data that stays in place as well as data that moves.

Ben Stopford
on Nov 16, 2020
Architecture & Design

Real Time APIs in the Context of Apache Kafka

Events offer a Goldilocks-style approach in which real-time APIs can serve as a foundation for applications that is flexible yet performant, loosely coupled yet efficient. Apache Kafka offers a scalable event streaming platform with which you can build applications around the powerful concept of events.

Robin Moffatt
on Oct 16, 2020
Architecture & Design

The Challenges of Building a Reliable Real-Time Event-Driven Ecosystem

Globally, there is an increasing appetite for data delivered in real time; we are witnessing the emergence of the real-time API. When it comes to event-driven APIs, engineers can choose between multiple protocols. In addition to choosing a protocol, engineers also have to think about subscription models: server-initiated (push-based) or client-initiated (pull-based).

Matthew O'Riordan
on Jul 30, 2020
AI, ML & Data Engineering

Applied Probability - Counting Large Set of Unstructured Events with Theta Sketches

In this article, author Ronen Cohen discusses a solution for processing event data using Theta Sketches and technologies like HBase and Kafka.

Ronen Cohen
on Jun 29, 2020
Cloud

Is Edge Computing a Thing?

Edge Computing is definitely a thing, but the computing need not occur at the edge. Instead, what is needed is the ability to compute (anywhere) on streaming data from large numbers of dynamically changing devices in the edge environment. This in turn demands an architectural pattern for stateful, distributed computing.

Simon Crosby
on Apr 17, 2020
AI, ML & Data Engineering

The Kongo Problem: Building a Scalable IoT Application with Apache Kafka

In this article, author Paul Brebner discusses the best practices for developing IoT projects using Apache Kafka and Kafka Streams technologies and how to maximize Kafka scalability.

Paul Brebner
on Feb 15, 2020
AI, ML & Data Engineering

Rethinking Flink’s APIs for a Unified Data Processing Framework

Since its very early days, Apache Flink has followed the philosophy of taking a unified approach to batch and streaming. The core building block is the “continuous processing of unbounded data streams, with batch as a special, bounded set of those streams.” Recent updates to the Flink APIs include architectural designs by the community to support batch and streaming unification in Apache Flink.

Aljoscha Krettek
on Sep 16, 2019
Architecture & Design

Increasing the Quality of Patient Care through Stream Processing

Today’s healthcare technology landscape is disaggregated and siloed. Physicians analyse patient data streams from different systems without much correlation. Even though the health-tech domain is mature and rich with data, its value is not directed towards increasing the quality of patient care. This article presents a stream processing solution in which streams are correlated.

Nuwan Bandara, Himasha Guruge, Nadeeshan Gimhana
on Jan 17, 2019