Business Processes, Long-Running Services and Microservices

During recent years domain events are increasingly being discussed, but we should be discussing commands just as much, Martin Schimak explained at the recent DDD eXchange 2018 conference at Skills Matter, London, where he covered events, command and long-running services in a microservices world, and how process managers and similar tooling can help in running core business logic.

For Schimak, an independent consultant in Austria, the best thing about events is that they are facts representing something that already has happened. We are increasingly dealing with distributed systems, and with a local guarantee from within a service that something has happened we can add trust to an eventually consistent world. Events also help us decouple services and let us look at the past.

All the advantages of using events is one reason why event-driven architectures are increasingly popular, sometimes with a design that only relies on events for integrating services. This is a simplification that may be reasonable, but Schimak notes that it also creates some dangers. One example is a simple order process only consisting of order placed, payment received, goods fetched, and goods shipped events used by payment, inventory and shipment services. A simple change like fetching the goods before charging the customer will change the flow of messages which will require a change in all involved services, and this is for Schimak a coupling between services that is suboptimal.

Since events are just facts, they don’t trigger any action by themselves. When listening to events we instead need some form of policy that decides what should happen when a specific event is received. In a pure event-based system, this policy is always in the consuming service. With a command-based approach this policy might be placed in the event publishing service, but Schimak argues that often none of the services are a good fit. For him, a third option is to add a mediator that listens to specific events and decides about the following step.

With an order service added to the previous example, this service could listen to relevant events and send commands, thus coordinating the process when a customer places an order and later gets that order fulfilled. With the same change as in the example, now only the order service needs a change. Schimak notes that the logic running in this process commonly is business logic belonging to the core domain of the business.

For Schimak, commands are intents for something to happen in the future, and he defines two types of execution of a command:

An atomic transaction execution, typically with an intent to change a model; an example is a place order command that leads to an order created and an order placed event published.
A composite, long-running execution with an intent of a more business-level result, possibly needing several steps to be achieved. An example is the same place order command, but where the end result is an order fulfilled or order cancelled event.

In a request payment scenario, we should strive to achieve a valuable business result. A payment service would then publish events like payment received or payment cancelled. In Schimak’s experience we instead often expose problems that could be temporary, like failing credit card charging, and delegating to the client to deal with this. This means that we force a client to deal with the policy kind of problems that clearly are payment concerns – maybe a retry should be done later, potentially with new credit card data. If the client is an order service, it must besides handling orders also deal with payments, thus spreading the payment domain knowledge out of the payment service. This also increases the size and complexity of the order service.

Delegating our problems to our clients forces them to deal with the mitigation. They become god services.

Instead we should see payment as a long-running service dealing with all internal problems related to payment, and only publish events related to the end result – payment received or payment cancelled. Schimak emphasizes that this is not about creating a central coordinator to take care of the whole business, it’s more about a good API design that helps in protecting different bounded contexts from each other.

A common tool when working with long-running services is a Process Manager. Typical requirements for a process manager are the handling of time and timeouts, retries, and compensation when a process has failed. We can implement all this ourselves, but Schimak prefers to use a framework like Axon messaging and Saga management, or Lagom. He also suggests looking into using some form of business process execution engine, but emphasizes that the tooling must be lightweight and usable within a single service. Examples of open source process engine frameworks include Activiti, Camunda and Zeebee, (also from Camunda). In the serverless space, AWS has created Step Functions, and other cloud vendors are also moving in this direction.

Schimak’s personal experience with long-running services and business process engines includes several years of using Camunda in the order fulfilment process at Zalando. He has also together with Bernd Rücker from Camunda written two articles, Events, Flows and Long-Running Services: A Modern Approach to Workflow Automation and Know the Flow! Microservices and Event Choreographies on InfoQ.

InfoQ Software Architects' Newsletter

Follow us on

Rate this Article

This content is in the Event Driven Architecture topic

Related Topics:

Related Editorial

Related Sponsors

Popular across InfoQ

The InfoQ Newsletter