Facilitating the Spread of Knowledge and Innovation in Professional Software Development

Write for InfoQ


Choose your language

InfoQ Homepage News Event Streams and Workflow Engines – Kafka and Zeebe

Event Streams and Workflow Engines – Kafka and Zeebe

This item in japanese

Lire ce contenu en français


Apache Kafka is a highly scalable distributed streaming platform often used to distribute messages or events within a microservices based system. These events are sometimes part of a business process with tasks spread over several microservices. To handle complex business processes a workflow engine can be used, but to match Kafka it must meet the same scalability Kafka provides. Zeebe is a workflow engine currently developed and designed to meet these scalability requirements. In a joint meeting in Amsterdam, Kai Waehner described features of Kafka and how it fits in an Event-Driven Architecture (EDA), and Bernd Rücker described workflow engines, Zeebe and how it can be used with Kafka.

For Waehner, technology evangelist at Confluent, one reason so many use Kafka today is that more and more applications, microservices, mobile apps and IoT devices are integrated, which provide much more data. We have to process more messages than before and at an increased speed, often with real-time use cases. We started many years ago with building point-to-point integrations but that doesn’t scale very well and is hard to maintain. About 10 years ago we started to use an Enterprise Service Bus (ESB) for integration. Today the ESB is replaced with a message streaming platform like Kafka which all applications are connected to.

EDA is not a new idea; the concept has been around for at least 10 – 20 years. The new thing lies in how we can process data. Instead of storing data in a database which some other service reads and processes, the data now flows and is processed continuously. An important differentiator for an event streaming platform is that it doesn’t use a static store like an SQL or NoSQL database; instead events or other messages are stored. This affects how you build applications; now events are published which are then consumed by other applications.

Waehner points out that Kafka is about three concepts; messaging, storage and processing of data. It’s a messaging broker like many other brokers, it’s a storage system where data can be stored as long as you want, and finally it can process data. Two important features that Kafka shares with databases are:

  • Strict ordering of messages, which is very important in many use cases
  • Persistence; all messages are stored on disk which means that it can crash without losing data

Another key component of Kafka is that it’s distributed by design and built for failures. Main concepts include replication, fault tolerance, partitioning and elastic scaling.

In Waehner’s experience many developers see Kafka only as a messaging platform, and therefore points out that there are two more components included:

  • Kafka Connect, an integration framework on top of core Kafka; examples of connectors include many databases and messaging systems
  • Kafka Streams for stream processing, which for Waehner is the easiest way to process data

Waehner concludes by noting that more and more he is seeing that Kafka is used when building core business applications — Kafka is operating the business. Analysing the business is still important but it's just a small part. Another trend he sees is that most of his customers are hybrid; they build new systems in the cloud but still have systems on premise, and they all need to communicate.

Bernd Rücker

Rücker, co-founder and developer advocate at Camunda, sees a clear trend towards the use of microservices during the last years among his customers. He has also noted a trend where some customers have started using an event-driven approach, and are now using it for everything.

Using the event notification pattern, systems are built with microservices responsible for different parts of the business. The services publish events to notify other services about things happening. To accomplish a business function, several services may be involved, sending events to each other. Rücker calls this peer-to-peer event chains and notes that one problem is that it can be hard to get a picture of the overall flow from a business perspective. He refers to an article by Martin Fowler where he points out that although the event notification pattern can be useful, it also adds a risk of losing sight of the larger-scale flow.

One approach to regain a view of the flow of events is using monitoring and tracing. In an article on InfoQ, Rücker describes examples of how this can be done. Process tracking is the way he prefers because by modelling the workflow and using a workflow engine listening to all events, it’s possible to verify that each event flow works correctly from a business perspective and to get notifications if they fail.

Rücker notes that process tracking is easy to apply since there is no need to change anything; just attach the workflow engine to the Kafka infrastructure. This can also be a first step into acting on potential failures, for example by adding timeouts and warnings when a process takes too long time to complete. He refers to a presentation by Vodafone about how they replaced an existing middleware, first by using tracking and then step-by-step replaced every task with orchestration.

A potential problem with a peer-to-peer event chain is when the work flow needs to change — this may require that several services must change their event subscriptions. This also requires a coordination between teams and in deployment of the services, as well as a consideration of ongoing workflows and active events in the system. To ensure that business processes are fulfilled, Rücker prefers to extract the end-to-end responsibility into one service. One advantage is that you will have one service being responsible for something that is very important for a company, and one single point where you control the sequence of tasks. This also gives the possibility to start using commands to control the workflow. Commands are orchestration — you tell someone to do something — but Rücker points out that the orchestration is an internal part of a microservice, not a central infrastructure mechanism every service uses. He also points out that with one service responsible for a workflow, there is one point where you can check for the state of running orders, number of successful orders, and so on.

Camunda is currently working on Zeebe, a horizontally scalable workflow engine for microservices, which makes it suitable for low latency and high throughput use cases in combination with Kafka. It’s still in developer preview state, but they have pilot customers that run 100-200 k workflow instances/s. According to Rücker, a production-ready version is planned for July 2019.

Both Wahner’s slides and Rücker’s slides from the presentations are available.

In an interview with InfoQ, Rücker notes that he thinks it’s a bit odd when order fulfilment, which is very important for a company, is not handled in the core domain, but must be done by monitoring of events to try to detect if any order is stuck somewhere. For him, an order fulfilment service should be concerned about orders fulfilled, not just publish an event that an order has been created and hope that other services make sure payment is handled and the goods delivered to the customer.

We commonly talk about event streams, but Rücker thinks we instead should talk about record or message streams and emphasizes that the term in the Kafka API is records, not events. For him Kafka can be used for different types of messages, like events and commands, and he refers to the seminal book Enterprise Integration Patterns written by Gregor Hohpe and Bobby Woolf, where Command, Document and Event messages are described.

Rate this Article