High Scalability Workflow Engine Zeebe is Production Ready


Zeebe is a workflow engine from Camunda, designed to meet the scalability requirements of high-performance applications running on cloud-native software architectures and to support workflows that span multiple microservices in low-latency, high-throughput scenarios. Through its support for message events, Zeebe can also operate in an event-driven architecture (EDA). Version 0.20.0 has just been released as a free community edition and is considered production ready.

According to Michael Winters, product manager at Camunda, production ready means that the new version can handle orchestration use cases in a reliable way. The requirements for labelling the release as production ready include:

  • Supports business logic via BPMN elements for most microservices orchestration use cases
  • Scales horizontally on a Kubernetes cluster
  • Is fault tolerant: no data is lost if a node goes down, and Zeebe recovers when such a failure occurs
  • Works with cloud-native components such as Docker, Kubernetes, and Apache Kafka
  • Provides workflow data for monitoring, troubleshooting and auditing

Traditionally, workflow engines store the current state of a workflow instance in a database. Zeebe instead uses event sourcing and stores all state changes as immutable events in an append-only event log. A projection of the current state of a workflow is stored as a snapshot using RocksDB. Both the log and the snapshots are stored on disk, which is currently the only option. Other storage options have been discussed, but none are on the roadmap.
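To make the idea concrete, here is a minimal sketch of event sourcing with a snapshot, written in plain Python. It is not Zeebe's implementation: the `Event`, `EventLog`, and `project` names and the event kinds are hypothetical, and everything is kept in memory rather than on disk or in RocksDB.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    """An immutable state change for one workflow instance (hypothetical shape)."""
    position: int
    instance_id: str
    kind: str  # hypothetical kinds, e.g. "CREATED" or "FINISHED"


@dataclass
class EventLog:
    """Append-only log: events are added at the end and never changed."""
    _events: list = field(default_factory=list)

    def append(self, instance_id: str, kind: str) -> Event:
        event = Event(len(self._events), instance_id, kind)
        self._events.append(event)
        return event

    def read_from(self, position: int) -> tuple:
        return tuple(self._events[position:])


def project(log: EventLog, snapshot=None, snapshot_position: int = 0) -> dict:
    """Rebuild the current state by replaying events on top of a snapshot."""
    state = dict(snapshot or {})
    for event in log.read_from(snapshot_position):
        state[event.instance_id] = event.kind  # latest event wins per instance
    return state


log = EventLog()
log.append("order-1", "CREATED")
log.append("order-2", "CREATED")
snapshot = project(log)           # projection taken after two events
log.append("order-1", "FINISHED")

# Replaying only the events after the snapshot gives the same state
# as replaying the whole log from the beginning.
current = project(log, snapshot, snapshot_position=2)
```

The snapshot is only an optimization here: the log remains the source of truth, and the projected state can always be rebuilt from it.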

To achieve the required fault tolerance, resilience and horizontal scalability, Zeebe is built as a distributed system without any central component or database. Multiple Zeebe brokers can be set up in a peer-to-peer cluster that uses the Gossip protocol to route data within the cluster. The Raft consensus algorithm is used to replicate the event log. To scale out, partitioning can be used, with every partition using a separate event log.
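The partitioning idea can be illustrated with a small sketch in which every partition owns a separate log and a stable hash decides where the events of a workflow instance go. The article does not describe how Zeebe assigns instances to partitions, so the CRC32-based routing and the `PartitionedLog` class below are assumptions made purely for illustration; replication via Raft is left out.

```python
import zlib


class PartitionedLog:
    """Hypothetical set of independent append-only logs, one per partition."""

    def __init__(self, partition_count: int):
        self.partitions = [[] for _ in range(partition_count)]

    def partition_for(self, instance_key: str) -> int:
        # Assumed routing rule: a stable hash keeps all events of one
        # workflow instance in the same partition.
        return zlib.crc32(instance_key.encode("utf-8")) % len(self.partitions)

    def append(self, instance_key: str, event: str) -> int:
        partition_id = self.partition_for(instance_key)
        self.partitions[partition_id].append((instance_key, event))
        return partition_id


cluster = PartitionedLog(partition_count=3)
for key in ("order-1", "order-2", "order-3", "order-1"):
    cluster.append(key, "TASK_COMPLETED")
```

Because each partition only appends to its own log, partitions can be placed on different brokers and written to independently.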

Zeebe runs as a separate instance on a Java virtual machine (JVM), which is a remote engine approach with applications communicating with Zeebe over the network. To maintain performance, streaming into the client and a binary protocol (gRPC) are used. This approach creates a defined setup and environment for Zeebe and isolates it from application code.

Zeebe does not implement any ACID transaction protocol. To mitigate this and handle the failures that may occur when a task within a workflow is executed, there are two options. By notifying Zeebe after the job is completed, at-least-once execution is achieved: Zeebe will be aware of a failure and execute the task again. By notifying Zeebe before the job is completed, at-most-once execution is achieved: Zeebe is unaware of any failure and the task will not be executed again. In an event-driven architecture, workflows can subscribe to external messages, and Zeebe supports exactly-once processing of these messages.
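The difference between the two options comes down to when the acknowledgement is sent relative to the work. The following sketch simulates both orderings with a task that fails on its first attempt. It does not use the Zeebe client API; `FlakyTask`, `run_at_least_once`, and `run_at_most_once` are hypothetical names, and the retry loop stands in for the engine handing out the job again.

```python
class FlakyTask:
    """Hypothetical side-effecting task that fails on its first invocation."""

    def __init__(self):
        self.attempts = 0
        self.effects = 0

    def run(self):
        self.attempts += 1
        if self.attempts == 1:
            raise RuntimeError("worker crashed")
        self.effects += 1


def run_at_least_once(task: FlakyTask, max_retries: int = 3) -> bool:
    """Acknowledge AFTER the work: an unacknowledged job is executed again."""
    for _ in range(max_retries):
        try:
            task.run()
        except RuntimeError:
            continue   # no acknowledgement was sent, so the job is retried
        return True    # acknowledgement is sent only after success
    return False


def run_at_most_once(task: FlakyTask) -> bool:
    """Acknowledge BEFORE the work: a failure stays invisible, no retry."""
    acknowledged = True  # the engine treats the job as done from here on
    try:
        task.run()
    except RuntimeError:
        pass             # the engine never learns about the failure
    return acknowledged


retried = FlakyTask()
run_at_least_once(retried)   # fails once, then succeeds on the retry

dropped = FlakyTask()
run_at_most_once(dropped)    # fails once and is never run again
```

With at-least-once execution a task can run more than once, for example if a worker fails after doing the work but before acknowledging it, so such tasks are best written to be idempotent.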

Since Zeebe uses event sourcing, it cannot easily handle queries to find, for example, workflow instances with problems. Instead, Exporters and the concept of CQRS are used. An exporter can access the event stream and create projections of the data, and this technique is used by Operate, a tool that comes with Zeebe for monitoring and troubleshooting workflows.
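As a rough illustration of the exporter idea, the sketch below consumes a stream of events and maintains a separate read model that answers the "which instances have problems" query directly. It does not implement Zeebe's exporter interface; the `IncidentExporter` class, the event dictionaries, and the event kinds are hypothetical.

```python
class IncidentExporter:
    """Hypothetical exporter: builds a query-side view from the event stream."""

    def __init__(self):
        self.open_incidents = {}  # instance id -> problem description

    def export(self, event: dict) -> None:
        instance = event["instance_id"]
        if event["kind"] == "INCIDENT_CREATED":
            self.open_incidents[instance] = event["message"]
        elif event["kind"] == "INCIDENT_RESOLVED":
            self.open_incidents.pop(instance, None)
        # all other event kinds are irrelevant to this projection

    def instances_with_problems(self) -> list:
        return sorted(self.open_incidents)


stream = [
    {"instance_id": "order-1", "kind": "CREATED"},
    {"instance_id": "order-1", "kind": "INCIDENT_CREATED", "message": "no worker"},
    {"instance_id": "order-2", "kind": "INCIDENT_CREATED", "message": "bad input"},
    {"instance_id": "order-1", "kind": "INCIDENT_RESOLVED"},
]

exporter = IncidentExporter()
for event in stream:
    exporter.export(event)
```

The write side (the event log) and the read side (the projection) stay separate, which is the core of CQRS: each projection can be shaped for the queries it needs to serve.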

Lately, there have been discussions about open source versus source-available licenses, where companies behind open source software restrict their licenses to prevent as-a-service providers from providing managed service offerings. Zeebe is distributed under The Zeebe Community License, a comparable source-available license, which does not allow for commercial offerings of a workflow service that uses Zeebe. The tool Operate is still in preview and comes with a developer license for free, non-production use. It is also possible to get enterprise support for both products.

The last developer preview release, 0.18.0, was feature complete, but the team behind Zeebe knows there will be a lot to learn from coming production deployments. To distinguish between production ready and finished, they therefore decided to call the new release 0.20.0, instead of 1.0.0.

In two blog posts, Bernd Rücker, co-founder and chief technologist of Camunda, describes the basics and main concepts of Zeebe and how they built a highly scalable distributed state machine.

To get help in understanding the main concepts of Zeebe and get a workflow up and running, a getting started tutorial is available.

Community comments

  • Enterprise Service Bus (ESB) 2.0

    by Kelvin Meeks,

    well, of course.
    SOA never died...

  • Re: Enterprise Service Bus (ESB) 2.0

    by Jan Stenberg,

    The most significant problem with an Erroneous Spaghetti Box is for me the central hub with a mix of transformations, communications, and business logic in the wrong place. Newer workflow engines like Activiti, Camunda, NServiceBus and Zeebe adopt the idea of "smart endpoints, dumb pipes". They are part of the application with the business logic in the right place, you can unit test the business logic, etc. I think it's a huge improvement.

  • Human Task Support?

    by Gavin Siller,

    Or only straight through processes?

  • Re: Human Task Support?

    by Bernd Ruecker,

    Zeebe focuses on STP at the moment.
    But as you can easily configure activities with task headers, you can build a generic job worker that pushes the activity to a tasklist and let Zeebe know whenever the task is completed.
    We might add more out-of-the-box human task management support later to Zeebe, but for now we focus on the microservices orchestration use cases and thus STP.