
Day Two Problems When Using CQRS and Event Sourcing



There are a lot of good reasons for building a CQRS and event-sourcing based system, and some benefits are only achievable in such systems, but there are also problems that appear only after an application is in production (day two problems). In a presentation at the recent Event-driven Microservices Conference held by AxonIQ, Joris Kuipers shared his experience running and evolving CQRS and event sourced applications in production.

For Kuipers, CTO at Trifork Amsterdam, one problem he sometimes sees is that people start to build this kind of system too early, before they really understand the domain. Later, they gain new insights that require the models to change. However, by then events have already been stored that map to aggregates which no longer fit the new models, and the refactoring this requires can be very hard to implement.

If you are building a traditional system that only stores current state, it’s often quite easy to change state during an upgrade; commonly, the only thing needed is to run some SQL statements during deployment. When updating a CQRS and event sourced application, this doesn’t work. Typically, newly implemented functionality will publish new event types and new versions of existing events. To minimize problems in such scenarios, Kuipers refers to Postel’s law, a robustness principle that states:

Be conservative in what you send, be liberal in what you accept.

For an event sourced application, this means that you should be conservative and consistent when emitting events, but liberal in what you accept; avoid validation against schemas and allow information unknown to you. Kuipers emphasizes that being liberal also includes your own events, both from the past and the future. It’s very important that both older and newer versions of an event sourced application can coexist as they evolve.
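A minimal sketch of such a tolerant reader, with hypothetical event and field names: unknown fields from newer event versions are ignored rather than rejected, and fields missing from older versions fall back to defaults.

```python
import json

# Fields this version of the application knows about, with defaults for
# anything an older producer might omit. All names are hypothetical.
KNOWN_FIELDS = {"order_id": None, "amount": 0, "currency": "EUR"}

def read_order_placed(raw_json: str) -> dict:
    """Tolerant reader: ignore unknown fields, default missing ones."""
    payload = json.loads(raw_json)
    event = dict(KNOWN_FIELDS)
    for key in KNOWN_FIELDS:
        if key in payload:
            event[key] = payload[key]
    return event  # extra fields from newer event versions are silently dropped

# A newer producer added a "discount_code" field; this reader still copes.
event = read_order_placed('{"order_id": "o-1", "amount": 100, "discount_code": "X"}')
```

Note that no schema validation is performed here, in line with the advice above; the reader only picks out what it understands.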

For external event processors, it often works well to ignore unknown event types and unknown fields in an event, but Kuipers points out that this is a general rule; there are exceptions, and a developer must make a conscious decision. For internal event handling, it's more complex. Both in blue-green deployments and in rollbacks after a misbehaving release, different versions of an event may already have been stored, which requires forward compatibility when reading events. In Kuipers' experience, this is extremely hard, and organizations often accept downtime instead. If possible, Kuipers recommends rolling forward: use small and frequent deployments, so that when a problem occurs it can be fixed and a new version deployed.
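One common way to let old and new versions of events coexist is to upcast older stored revisions to the current shape at read time. A sketch of the idea, with hypothetical field names and revision numbers:

```python
def upcast(stored: dict) -> dict:
    """Rewrite older event revisions to the current one at read time.
    Field names and revision numbers are hypothetical."""
    event = dict(stored)
    if event.get("revision", 1) == 1:
        # Revision 2 split the single "name" field into first/last name.
        first, _, last = event.pop("name", "").partition(" ")
        event.update(first_name=first, last_name=last, revision=2)
    return event

v1 = {"revision": 1, "name": "Joris Kuipers"}
v2 = upcast(v1)
```

This handles backward compatibility (new code reading old events); the forward-compatibility case described above, where an old version must read newer events, is what typically forces the tolerant-reader approach instead.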

A problem for public event handlers is that events are often marshalled from specific class types into JSON or XML, and then unmarshalled back into the same types. This may cause problems in external systems for new event types, and for systems built on other platforms. Using metadata and annotations can often help mitigate these problems.

Separating internal and external events is something Kuipers has found useful, because they have different requirements. Public events must be more stable and must adhere to contracts and similar agreements; fields can’t simply be removed or have their meaning or type changed. Internal events are never exposed publicly, which means you have full control over all parts that deal with them, simplifying matters when they must change.
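A sketch of this separation, with hypothetical names: internal events evolve freely inside the service, while a thin translator publishes a stable external contract derived from them, exposing only the agreed-upon fields.

```python
def to_external(internal_event: dict) -> dict:
    """Map a freely-evolving internal event onto a stable public contract.
    Only the agreed-upon fields are exposed; internal details never leak."""
    return {
        "type": "OrderShipped",              # stable public contract name
        "order_id": internal_event["order_id"],
        "shipped_at": internal_event["ts"],
        # internal fields such as "warehouse_shard" are deliberately omitted
    }

internal = {"type": "OrderShippedV7", "order_id": "o-1",
            "ts": "2021-11-01T10:00:00Z", "warehouse_shard": 3}
public = to_external(internal)
```

Because the mapping is explicit, internal refactorings only require updating this one translator rather than every external consumer.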

With a new version of an application, some aggregates may need additional state, and for Kuipers this is handled well when CQRS and event sourcing is used. Initially, the aggregates can have a minimum of state, just enough to be able to handle the business logic. But as the need for more state increases, new state can be added to the corresponding aggregates, and due to the power of event sourcing it will look as if this state has been there since the beginning.
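A sketch of why this works, with hypothetical event and field names: the aggregate rebuilds its state by replaying its full event history, so a field added in a newer application version is populated from day-one events as if it had always existed.

```python
class OrderAggregate:
    """Event-sourced aggregate (hypothetical). A later application version
    added `total_spent`; replaying the full history populates it as if
    the field had existed from the start."""
    def __init__(self):
        self.order_ids = []
        self.total_spent = 0  # field added in a newer application version

    def apply(self, event: dict):
        if event["type"] == "OrderPlaced":
            self.order_ids.append(event["order_id"])
            self.total_spent += event.get("amount", 0)

def load(history: list) -> OrderAggregate:
    """Rehydrate an aggregate by replaying its stored events in order."""
    agg = OrderAggregate()
    for event in history:
        agg.apply(event)
    return agg

history = [{"type": "OrderPlaced", "order_id": "o-1", "amount": 40},
           {"type": "OrderPlaced", "order_id": "o-2", "amount": 60}]
agg = load(history)
```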

Snapshots are often used as an optimization to store the current state of aggregates, but when an aggregate changes, new snapshots must be created. An approach that works in systems with a small number of events is to simply delete all snapshots and recreate them as they are required. In systems where a large number of events have been stored, the snapshots have to be recreated up front, which requires some tooling.
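One way to implement the delete-and-recreate approach is to version snapshots and silently discard any snapshot whose version no longer matches, falling back to a full replay. A minimal sketch, with hypothetical names and a trivial counter as the aggregate state:

```python
SNAPSHOT_VERSION = 2  # bump whenever the aggregate's state shape changes

def load_with_snapshot(snapshot, events):
    """Use the snapshot only if it matches the current aggregate version;
    otherwise fall back to a full replay (all names hypothetical)."""
    if snapshot and snapshot["version"] == SNAPSHOT_VERSION:
        state, start = snapshot["state"], snapshot["up_to"]
    else:
        state, start = {"count": 0}, 0  # stale snapshot: replay from scratch
    for event in events[start:]:       # apply events newer than the snapshot
        state["count"] += 1
    return state

stale = {"version": 1, "state": {"count": 3}, "up_to": 3}
state = load_with_snapshot(stale, [{"e": i} for i in range(5)])
```

Here the stale version-1 snapshot is ignored and all five events are replayed, which is exactly the behavior that becomes too slow with large event stores and then requires up-front rebuilding instead.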

For new query projections, a similar approach to the one for snapshots can be used. In systems with a small number of events, the projections can be recreated by replaying all events. In systems with many events, a better approach is to update the projections. One way to do this is to use custom commands and events only used during the update phase.
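Rebuilding a projection by replay can be sketched as follows, with hypothetical event and field names: the read model is dropped and repopulated by folding every stored event into it from the beginning.

```python
def rebuild_projection(events):
    """Recreate a read model from scratch by replaying all events.
    Feasible when the event count is small; names are hypothetical."""
    orders_by_customer = {}
    for event in events:
        if event["type"] == "OrderPlaced":
            orders_by_customer.setdefault(event["customer"], []) \
                              .append(event["order_id"])
    return orders_by_customer

events = [{"type": "OrderPlaced", "customer": "c-1", "order_id": "o-1"},
          {"type": "OrderPlaced", "customer": "c-1", "order_id": "o-2"}]
projection = rebuild_projection(events)
```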

Kuipers concludes by encouraging people who have experience running an event sourced system for several years to share their experience with others. What does it mean to have an application running for five years, with aggregates in use that have events since the beginning? How do you evolve, maintain, and deploy new versions of such a system?

The presentations at the conference were recorded and will be publicly available during the coming weeks.


Community comments

  • Event sourcing

    by Daniel Lidström,


    It seems to me, similar to what you wrote, that event sourcing is best avoided when the domain is still being discovered. In my case this has been true for most domains I've worked on. If the domain is known and unlikely to change, then event sourcing might be a good fit. But still, there are many ways of solving the problems that event sourcing intends to solve. I've yet to be convinced of a single scenario where event sourcing is the most effective tool to solve a problem.

    Relational databases have been around a long time. Modeling with relational databases is a well-studied area compared to modeling with event sourcing. The constraints imposed by a relational database are an immense help for developers when trying to model a business domain, and I would say this is greatly underestimated.

    Compare this with the vast open field that is the modeling guidelines for event sourced systems. The guidelines are pretty much non-existent. Practically, you might even have a single event stream for an entire system. There are no boundaries on how much can be taken on by a single domain object in an event sourced system. I've seen similar mistakes made by experienced developers, and I too have had trouble figuring out where the boundaries are for an event sourced domain model. And I too have seen the problems of migrating events that no longer make as much sense in the business domain.

    I would like to conclude, from my perspective, that event sourcing should be deferred for as long as possible, with all other alternatives explored before using it. It has to be fully determined that event sourcing is the only option left.

    Udi Dahan wrote about the pitfalls of this topic almost 10 years ago. Yet it seems event sourcing is pushed more than ever these days (perhaps mostly by those who profit from its spread?).
