Evolving CQRS and Event Sourced Systems

After talking with people about upgrading CQRS and event sourced systems, Michiel Overeem came to the conclusion that many of those working with event sourced systems lack knowledge and understanding of the challenges, and don’t know how to approach the problem. At the recent DDD Europe 2018 conference in Amsterdam he described how this was a trigger for him to carry out exploratory research on how to evolve this kind of system.

AFAS, where Overeem is lead software architect, is a software company that has been building ERP systems for 20 years. Overeem started to investigate how to upgrade CQRS-based systems because they are in the process of doing a complete rebuild of their current system. The new system will be based on CQRS and event sourcing, but more importantly they have decided to put all their knowledge in a model and generate the ERP system from the model. One benefit they see in this is a separation between business logic and technology. One challenge is how to deal with model changes and upgrades when the system has been running for some time, creating large numbers of events.

To find out how other teams are dealing with upgrades of CQRS and event sourced systems, Overeem and his team conducted 24 interviews with people from different business areas, working with event volumes from 50,000 up to millions. During the interviews they focused on evolving the event schema and identified five areas: prevention, technique, ruling, privacy and the read models.

To create a basic viewpoint, Overeem emphasizes that there always is an implicit schema in an event sourced system. The knowledge about the schema is embedded in the system, but, in contrast to a relational database, an event store will not enforce this schema; it will happily store the data given to it.

One way of preventing, or minimizing, the need for changes is to take a Domain-Driven Design approach. With events mimicking real world events in the domain you will get fewer changes because domains don’t change too often. Another approach came from a company that had a committee looking at all event changes and their impact; they sometimes would suggest other changes that would have less of an impact.

The most common technique mentioned was to never change the events or their schema; instead, a new event is introduced for every change. But although many mentioned this practice, Overeem found that almost no one actually used it. It's a very easy technique, but the risk is that your domain will become more fragile. Changes that are domain specific often work well, but introducing versions of events may ruin your domain.

Weak schema was a technique commonly used. If you are using a weak schema when deserializing to an object, you only care about the properties or attributes you are using. As long as the data is available or there is a default value, the schema is fulfilled.
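A minimal sketch of the weak schema idea, assuming JSON-serialized events. The event name `CustomerRegistered`, its fields and the helper `read_customer_registered` are hypothetical and chosen only for illustration; they do not come from the talk.

```python
import json
from dataclasses import dataclass


@dataclass
class CustomerRegistered:
    # Only the properties this consumer actually uses (hypothetical names).
    customer_id: str
    name: str
    # Assumed to be added in a later version; older events do not carry it.
    country: str = "unknown"


def read_customer_registered(raw: str) -> CustomerRegistered:
    """Deserialize tolerantly: ignore unknown fields, default missing ones."""
    data = json.loads(raw)
    return CustomerRegistered(
        customer_id=data["customer_id"],
        name=data["name"],
        country=data.get("country", "unknown"),
    )


# An old event without 'country' and a newer one with an extra 'referrer' field.
old_event = '{"customer_id": "c-1", "name": "Ada"}'
new_event = '{"customer_id": "c-2", "name": "Bob", "country": "NL", "referrer": "ad"}'

old_customer = read_customer_registered(old_event)
new_customer = read_customer_registered(new_event)
```

Both events satisfy the reader: the old one falls back to the default value, and the extra field on the new one is simply ignored.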

Update is a technique similar to doing an update in a relational database. With this technique you introduce a new tool that also has the schema information and updates the event store out of band. For Overeem this is a scary technique; if you make a mistake you may destroy the event store. But there are also rewrites that are not destructive; you can for instance rewrite your events with a new storage technique and change from XML to JSON.

Copy-transform was the most used technique. Here you read from one event store and write to a new event store, at the same time transforming the events to a new schema. A slightly different way is to create a new stream of events from an old stream but keep the new stream in the same store. An alternative when using microservices is to create a new microservice with a new event schema, and then copy all events from the old microservice.
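Copy-transform could be sketched roughly as follows, with in-memory lists standing in for the two event stores. The event shape, the `split_name` transformation and the function `copy_transform` are assumptions made for this example, not something described in the talk.

```python
from typing import Callable, Dict, Iterable, List

Event = Dict  # assumed shape: {"type": str, "version": int, "data": dict}


def split_name(event: Event) -> Event:
    """Hypothetical schema change: replace 'name' with first and last name."""
    if event["type"] != "CustomerRegistered" or event.get("version", 1) >= 2:
        return event  # other events are copied unchanged
    first, _, last = event["data"]["name"].partition(" ")
    data = {k: v for k, v in event["data"].items() if k != "name"}
    data.update(first_name=first, last_name=last)
    return {**event, "version": 2, "data": data}


def copy_transform(source: Iterable[Event],
                   transform: Callable[[Event], Event]) -> List[Event]:
    """Read events in order from the old store and build the new store."""
    return [transform(event) for event in source]


old_store = [
    {"type": "CustomerRegistered", "version": 1,
     "data": {"id": "c-1", "name": "Ada Lovelace"}},
    {"type": "CustomerDeactivated", "version": 1, "data": {"id": "c-1"}},
]
new_store = copy_transform(old_store, split_name)
```

The old store is left untouched, so it can remain in use, or serve as a fallback, until the new store has been verified.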

Due to data protection rules, like General Data Protection Regulation (GDPR) for the European Economic Area (EEA), privacy is becoming more important. To remove personal data, Overeem found three strategies:

Use transformation techniques to remove data from the event store.
Use separate events, one main event and one event with personal data that is stored separately. The personal data for an individual can then be removed.
Use an encryption approach with different encryption keys for each individual’s events. The key for events for one individual can then be deleted, thus preventing any read of these.
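The second strategy, keeping personal data in separately stored events, might look like the sketch below. It is a simplification with in-memory structures; `event_store`, `personal_events`, `register_customer`, `forget` and `describe` are hypothetical names used only for illustration.

```python
# Append-only main store: holds no personal data, only a reference.
event_store = []
# Separate, deletable store: personal-data events keyed by person id.
personal_events = {}


def register_customer(person_id, name, email, plan):
    """Split one business fact into a main event and a personal-data event."""
    personal_events.setdefault(person_id, []).append(
        {"type": "CustomerPersonalDataRecorded", "name": name, "email": email}
    )
    event_store.append(
        {"type": "CustomerRegistered", "person_id": person_id, "plan": plan}
    )


def forget(person_id):
    """Remove all personal-data events for one individual."""
    personal_events.pop(person_id, None)


def describe(event):
    """Read side: combine the main event with personal data if it still exists."""
    records = personal_events.get(event["person_id"])
    name = records[-1]["name"] if records else "<removed>"
    return f"{name} registered for plan {event['plan']}"


register_customer("p-1", "Ada", "ada@example.com", "basic")
before = describe(event_store[0])
forget("p-1")
after = describe(event_store[0])
```

After `forget`, the main event stream is unchanged and still replayable, but consumers have to cope with the personal part being absent.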

The obvious strategy for read models is to just rebuild them. One disadvantage is that this may take a long time. To overcome this, Overeem notes that it’s important that the projectors can run in parallel. It may also be possible to skip over old events or filter out events on some other property. If possible, he recommends running the projectors to a second store, letting the original store be in use until the new one is ready.
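As an illustration, a rebuild into a second store with independently running projectors could be sketched like this. The events, the two projectors and the `rebuild` function are hypothetical, and a thread pool is used only to show that projectors which share no state can run side by side; a real system would more likely distribute them over processes or machines.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical event history to replay.
events = [
    {"type": "OrderPlaced", "customer": "c-1", "amount": 10},
    {"type": "OrderPlaced", "customer": "c-2", "amount": 20},
    {"type": "OrderPlaced", "customer": "c-1", "amount": 30},
    {"type": "CustomerRenamed", "customer": "c-1"},
]


def project_orders_per_customer(history):
    counts = {}
    for event in history:
        if event["type"] == "OrderPlaced":  # skip events this model ignores
            counts[event["customer"]] = counts.get(event["customer"], 0) + 1
    return counts


def project_revenue(history):
    return sum(e["amount"] for e in history if e["type"] == "OrderPlaced")


PROJECTORS = {
    "orders_per_customer": project_orders_per_customer,
    "revenue": project_revenue,
}


def rebuild(history, projectors):
    """Run every projector over the full history into a fresh store."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn, history) for name, fn in projectors.items()}
        return {name: future.result() for name, future in futures.items()}


# The live store keeps serving queries while the second store is built.
live_read_models = {"orders_per_customer": {}, "revenue": 0}
rebuilt_read_models = rebuild(events, PROJECTORS)
live_read_models = rebuilt_read_models  # switch over once the rebuild is done
```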

Overeem summarizes by noting that there are a lot of options when evolving CQRS and event sourced systems, and the people they interviewed commonly used more than one technique. He emphasizes that you must know what you are doing and understand your own context. There are many easy answers that don’t take the context into account.

Greg Young is currently writing a book about Versioning in an Event Sourced System that is 70% complete.

The planning for DDD Europe 2019 has started but the exact date has not yet been set.
