Udi Dahan on CQRS, DDD and NServiceBus


1. We're here with Udi Dahan. Udi, could you tell us a little bit about yourself and what currently keeps you busy?

My name is Udi Dahan, I'm known as the "Software Simplist", I'm a so-called expert in enterprise development and service-oriented architecture, and I help companies from the large to the small solve their big distributed systems problems. That's about it.


2. One of the things you've talked about lately is CQRS. Could you give us a short explanation of the basics of CQRS?

The acronym CQRS stands for Command Query Responsibility Segregation, and there has been a lot of discussion over how far CQRS actually goes and what actually breaks down into other patterns. I think one of the important aspects of it is just to understand that, from a business perspective, there is a very large difference in expectation towards how the system behaves when handling commands versus queries, and in the time of visibility: how long it takes from when data is entered into the system until it needs to be visible to the same user who entered it or to other users in the system. It is something that, to a large extent, a lot of technologists haven't spent a great deal of time thinking about, and CQRS is one way of focusing us on asking these questions rather than actually dictating an answer. So I describe it more as an approach towards analyzing a given problem domain, rather than an overly prescriptive solution, whether it's eventually consistent or not, for example.


3. To me there seem to be two common ways of implementing CQRS - one using an ORM or similar and the other using event sourcing. Would you use event sourcing on a greenfield project?

First we have to say maybe a few words about event sourcing as possibly a layer or pattern on top of CQRS. One of the benefits of event sourcing as an approach is the built-in audit log that it creates by storing all state in the system not as a snapshot of state, but rather as the series of events and their data that actually brought about that state. It gives us a built-in audit log of how we got to the current system state, and allows us to do all sorts of interesting things like replaying that audit log for other purposes going forward. That's event sourcing as it is. The question of whether I'd use it on a greenfield project or not kind of leads to the question of whether there are business requirements that state I need to have that kind of audit log.

And does that audit log need to replace my state, or can I use standard ORM techniques for my state while having an audit log off to the side that isn't used for automatic replay and state creation? I'm afraid that the answer is: it depends. The questions that I'd ask on that project are "Do we need this capability of having this kind of audit log?" and "What is the cost of implementing the system like this versus with, say, a standard ORM solution?" Then I'd bring these numbers to the relevant business stakeholders and say "Here is what you get if we do it like this, and here is what it costs. Here is what you get if we do it like that, and here is what it costs. Pick." As an IT person, I don't particularly care which one they pick, as long as they understand the ramifications and costs of each of their choices. I'd say that on a greenfield project I would ask the question, most definitely. I would spend the 5-10 minutes doing the analysis and putting these options on the table. Not assuming either way that an ORM is the solution or that event sourcing is the solution, but putting both of them on the table, making sure that everybody understands the implications, and very happily living with the result of that decision.
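As a rough illustration of the event-sourcing idea described in this answer (state stored as the series of events that produced it, then rebuilt by replay), here is a minimal Python sketch. The account domain, the `Deposited` and `Withdrawn` event types and the `replay` helper are all hypothetical names chosen for the example; none of them come from the interview or from any particular framework.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Deposited:
    """Hypothetical event: money was added to the account."""
    amount: int


@dataclass(frozen=True)
class Withdrawn:
    """Hypothetical event: money was taken out of the account."""
    amount: int


def apply(balance: int, event: object) -> int:
    """Fold a single event into the current state."""
    if isinstance(event, Deposited):
        return balance + event.amount
    if isinstance(event, Withdrawn):
        return balance - event.amount
    raise TypeError(f"unknown event: {event!r}")


def replay(events: List[object]) -> int:
    """Rebuild state from scratch by replaying the whole event log."""
    balance = 0
    for event in events:
        balance = apply(balance, event)
    return balance


# The log is the source of truth and doubles as the audit trail.
event_log = [Deposited(100), Withdrawn(30), Deposited(5)]
current_balance = replay(event_log)

# Replaying a prefix of the log answers "what was the state back then?".
balance_after_two_events = replay(event_log[:2])
```

The ORM alternative mentioned in the answer would instead store only the latest balance and, if needed, write audit records to a separate table that is never used to rebuild state.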


4. How is CQRS linked to Domain Driven Design or is it?

Domain Driven Design is a very large body of patterns and pattern language in its own right. CQRS has a certain impact on how domain models are used. Domain models being a very important part of Domain Driven Design, I'd say that that's probably the largest area. The most significant impact that I see CQRS having on the domain model part of a Domain Driven Design is that the domain model will not take part in queries when doing CQRS, rather it's only involved in command processing. As a result of that, all sorts of entity relationships that we may have introduced in the past in our domain model, particularly the one-to-many and the many-to-many relationships between entities, that to a large extent are serving primarily queries, those relationships are no longer needed in the domain model as queries are served outside the domain model. In that respect I can see it as simplifying the domain model and have it more focused on behavior -- how we process commands -- rather than being focused on data, which is to a large extent how we respond to queries.

There is a certain gray area between the two that has to do with things like security and visibility - who can see which field or which part of the data - which is often described as domain logic but is used in the context of queries. That's the area where I think the combination of DDD and CQRS is most interesting, not in the answers that it provides, but in the questions that it guides us to: really having a very good understanding of the responsibility of a domain model and what the security requirements actually are. I'd go one step further and introduce the concept of bounded contexts, which are a very significant part of any DDD effort, and raise the question: "Are these security requirements or filtering or visibility requirements really in the same bounded context as the domain model that we are currently looking at?" It's worth spending the time on understanding that. To a large extent, every time that I've personally run into those gray areas where something doesn't seem to fit cleanly on either the command side or the query side, it's been an indication of another bounded context that wants to be introduced. I do see the DDD approach and the CQRS approach really playing off each other's strengths, and helping guide people to a better choice of a combination of patterns from both worlds than if we try to just do DDD or to just do CQRS.
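To make the split described in this answer concrete, here is a small Python sketch under stated assumptions: the `Order` entity, the `OrderListReadModel` and the `handle_cancel_order` function are hypothetical names invented for illustration. The command-side entity carries only behavior, while queries are answered from a separate flat store that the domain model never touches.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Order:
    """Command-side entity: behavior only, no query-oriented relationships."""
    order_id: str
    status: str = "open"

    def cancel(self) -> None:
        if self.status == "shipped":
            raise ValueError("a shipped order cannot be cancelled")
        self.status = "cancelled"


@dataclass
class OrderListReadModel:
    """Query-side store: flat rows shaped for one screen."""
    rows: Dict[str, dict] = field(default_factory=dict)

    def on_status_changed(self, order_id: str, status: str) -> None:
        self.rows[order_id] = {"order_id": order_id, "status": status}

    def open_orders(self) -> List[dict]:
        return [row for row in self.rows.values() if row["status"] == "open"]


# Hypothetical in-memory stores standing in for real persistence.
orders = {"o-1": Order("o-1"), "o-2": Order("o-2")}
read_model = OrderListReadModel()
for order in orders.values():
    read_model.on_status_changed(order.order_id, order.status)


def handle_cancel_order(order_id: str) -> None:
    """Command handler: runs domain behavior, then updates the query side."""
    order = orders[order_id]
    order.cancel()
    read_model.on_status_changed(order.order_id, order.status)


handle_cancel_order("o-1")
still_open = read_model.open_orders()
```

In this toy version the read model is updated in the same call as the command. In a real system the update might be delivered asynchronously, which is where the visibility-time question from earlier in the interview comes in.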


5. I expect then that you wouldn’t consider using CQRS without Domain Driven Design?

That's an interesting question. With CQRS's focus on commands, the commands involved are meant to be asynchronous from a technical perspective, but designed in such a way that they will most often succeed. In order to do that, CQRS comes with a set of techniques, like applying a validation component client side before a command gets sent, or using the information from the query model to check for uniqueness or for related entity existence even before the command is sent. This leads to commands that, after they pass those preliminary checks, will very likely succeed. However, the nature of those commands that succeed is often very behavior-centric: it's not very much in the domain of create, update or delete of a given entity, but a lot more business-centric and, I'd say, domain-centric as well. Seeing as these are the commands that our users are putting into the system, these behaviors are critical to their domain language.

I would definitely expect them to be a part of the DDD ubiquitous language between our groups. I would be very conscientious about not having undifferentiated bags of commands, and would expect to see commands divided up into different bounded contexts. I think that those parts of DDD will likely come into play in any CQRS effort. That being said, I think that there may be certain commands, maybe not all of them, but certain commands, that can be processed without a domain model. In other words, commands where all the command is doing is adding some additional data that does not require very much validation or business rules, and we all have commands like that in our systems. The fact that one command from CQRS is not processed using the domain model does not mean that it is not DDD. There is more to DDD than the domain model, and a lot of those other parts will strongly influence all of the CQRS effort. I do think that one can look at applying them independently from a purely technical perspective, but when actually looking at the more strategic elements of DDD - ubiquitous language, bounded contexts, those kinds of things - those things very much do come into play with CQRS. It's my opinion that trying to do CQRS for the entire system without any bounded contexts would likely be a very large failure, just as large as if we tried to create a single enterprise domain model. It's very important to know, for each pattern, what its maximum scope is and what other patterns go around the edges to handle that.
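The pre-send checks mentioned in this answer (client-side validation, plus uniqueness and related-entity checks against the query model) might look roughly like the following Python sketch. The `RegisterCustomer` command, the `precheck` function and the two lookup sets are hypothetical stand-ins for whatever the query model exposes.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class RegisterCustomer:
    """Hypothetical behavior-centric command, named in the domain language."""
    email: str
    country_code: str


def precheck(cmd: RegisterCustomer,
             known_emails: Set[str],
             known_countries: Set[str]) -> List[str]:
    """Run the checks that should pass before the command is sent.

    The lookups read from (possibly slightly stale) query-side data, so a
    clean result makes success likely rather than guaranteed.
    """
    problems = []
    if "@" not in cmd.email:
        problems.append("email is malformed")            # plain validation
    if cmd.email.lower() in known_emails:
        problems.append("email is already registered")   # uniqueness
    if cmd.country_code not in known_countries:
        problems.append("unknown country")               # related entity exists
    return problems


# Hypothetical query-model data available to the client.
known_emails = {"ann@example.com"}
known_countries = {"IL", "SE"}

ok = precheck(RegisterCustomer("bob@example.com", "SE"), known_emails, known_countries)
bad = precheck(RegisterCustomer("Ann@example.com", "XX"), known_emails, known_countries)
```

Only commands with an empty problem list would be sent, which is what lets the asynchronous command side assume that most of what arrives will succeed.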


6. You also have an open source project called NServiceBus. For .NET developers using WCF, why should I start using NServiceBus?

I don't necessarily want to do a 1-1 comparison of WCF and NServiceBus because these frameworks are designed to solve different problems. WCF was designed to be a replacement for Web Services, Enterprise Services and .NET Remoting, and it does that very well. As a result it allows all the communication models that they allow, for example synchronous request/response. The other element in WCF, added later on in its development lifetime, was the integration of MSMQ. Developers who work with WCF for a certain period of time tend to notice that, when moving to MSMQ, a lot of the WCF behaviors need to be configured very differently than they were for the original three. So although it's a consistent API model, once you put MSMQ under WCF you must move to an asynchronous messaging contract. You can no longer have methods that return values in your contract; they all have to be void. It's an imperfect abstraction because one-way messaging and synchronous request/response are so different from each other. NServiceBus is primarily designed around the concept of one-way messaging. That's all it is designed to do. It is not designed to do synchronous RPC. Those elements are not visible in NServiceBus, and part of the goal was to make building systems using one-way messaging as simple and straightforward as possible by setting up APIs that were designed specifically for that. It also includes features like Publish/Subscribe that don't exist in WCF, or at least are not available out of the box. Developers can build them on their own, but it's hard to do a comparative analysis at that level. The reason that I'd recommend evaluating NServiceBus is the durability and reliability behavior of MSMQ and one-way messaging, and the fact that when doing asynchronous one-way messaging, scalability is often a lot easier to achieve than when doing synchronous request/response.

A lot of very challenging non-functional requirements - reliability, availability, scalability, fault tolerance - are handled much better by this style of architecture, and having a framework that is designed specifically and only for that style of architecture leads to the simplest, easiest way for developers to work. Let alone the Publish/Subscribe functionality that just does not exist in WCF. When looking at architectures like CQRS, which have an element of one-way messaging for the sending of commands and an element of publish/subscribe for synchronizing the data from the command side to the query side, a lot of those capabilities are sorely needed. I've found that developers are able to implement that kind of architecture much more quickly when using NServiceBus than when using WCF. That being said, the query part of CQRS is inherently synchronous. The query is saying "I need this data now, not some time in the future, and I would like to block until it's available", and that is an area where I don't think that NServiceBus should be used. I think that the synchronous request/response behavior of something like WCF would be entirely applicable. In that kind of CQRS architecture I would expect to see NServiceBus used in one part of the system and possibly WCF used in another part of the system. It's not an "either/or," absolute, or one-framework-to-rule-them-all choice; it's just really understanding the various parts of the architecture and their specific needs, and choosing the appropriate framework for each one.
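The shape of one-way messaging and publish/subscribe discussed in this answer can be sketched with a toy in-memory bus. This is a Python illustration only: `InMemoryBus` and its `send`, `publish`, `subscribe` and `pump` methods are hypothetical and are not the NServiceBus or WCF API. The point is that sending returns nothing, a command handler publishes an event, and a query-side subscriber reacts to it.

```python
from collections import defaultdict, deque
from typing import Callable, DefaultDict, Deque, List, Tuple

Handler = Callable[[dict], None]


class InMemoryBus:
    """Toy bus: send() and publish() return nothing, by design."""

    def __init__(self) -> None:
        self._queue: Deque[Tuple[str, dict]] = deque()
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, message_type: str, handler: Handler) -> None:
        self._handlers[message_type].append(handler)

    def send(self, message_type: str, body: dict) -> None:
        # One-way: the caller enqueues and moves on; there is no reply value.
        self._queue.append((message_type, body))

    def publish(self, message_type: str, body: dict) -> None:
        # Events travel the same way; every subscriber gets a copy on delivery.
        self._queue.append((message_type, body))

    def pump(self) -> int:
        """Deliver everything queued; return how many messages were processed."""
        processed = 0
        while self._queue:
            message_type, body = self._queue.popleft()
            for handler in self._handlers[message_type]:
                handler(body)
            processed += 1
        return processed


bus = InMemoryBus()
write_store = {}   # command side
read_model = []    # query side


def handle_place_order(body: dict) -> None:
    write_store[body["order_id"]] = body["total"]
    bus.publish("OrderPlaced", body)


def project_order_placed(body: dict) -> None:
    read_model.append(f"{body['order_id']}: {body['total']}")


bus.subscribe("PlaceOrder", handle_place_order)
bus.subscribe("OrderPlaced", project_order_placed)

reply = bus.send("PlaceOrder", {"order_id": "o-1", "total": 40})
rows_before_delivery = len(read_model)
processed = bus.pump()
```

Between `send` and `pump` the read model is still empty, which is the visibility gap the command side has to be designed around. A query, by contrast, would read `read_model` directly and synchronously.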


7. NServiceBus 2.0 was just released. What can you tell us about the new version?

There are some very interesting things that went into 2.0 that make using NServiceBus at a very large scale a lot more practical. One thing that was introduced was the generic host process, which gives developers a much friendlier programming model for writing Windows Services. Windows Services, when they are developed purely in Visual Studio, are not very easy to program against and they are very difficult to debug. Hearing that pain point from the NServiceBus user community, we said: "OK, let's create a generic host process that can allow developers to debug their code very easily through this host process and get a very simple installation experience by telling the host 'Install yourself as a Windows Service.'" That just makes building these systems in a production-friendly manner a lot more repeatable, because you just don't have to write a lot of Windows Service code. Another addition in the 2.0 time frame was two additional pieces of the NServiceBus framework - the gateway process and the proxy process. These are two processes that enable developers using NServiceBus to create multi-site, geographically distributed systems much more easily, without actually changing any of their programming model.

So, you are able to bridge NServiceBus over HTTP from one site to another using the gateway process, just by installing two additional processes at each site, and everything else is just standard NServiceBus programming. The proxy process allows for multi-site Publish/Subscribe. This is especially critical for companies that are providing software-as-a-service smart clients, which allow their customers to install multiple smart clients on their end while the company keeps control of the servers centrally. The issue was simply that, in order to support this multi-site publish/subscribe, a port needed to be open in the firewall for every smart client, which was something that security people just couldn't live with. The proxy process is installed and runs on the client site and serves as a proxy for the actual server, requiring only a single port open in the firewall, and it handles all the publish/subscribe in that site on behalf of the server. It really enables people who have been using NServiceBus to develop single-site applications to go global in a very straightforward manner, just by installing two additional processes. I'd say that those are really the big three things - the generic host, the gateway and the proxy.


8. Where do you see NServiceBus going from here?

There is a lot of interest in NServiceBus from a large number of companies. The banking industry, to a large extent, or at least those in it that are developing with .NET, have started falling in love with it and very much want to see a WebSphere MQ transport, because that's what they use; they are all MQ. That's one of the things that we'd be looking at providing in 2.1, along with Sonic MQ integration, really making it a lot friendlier to integrate with the transport technology that is already in organizations. That's coming on the roadmap. Other things will probably be just more polish, more refinement, better documentation, a lower barrier to entry for developers. That's probably the big thing. It's increasing the capacity of developers, with the same code that they have always written, to have a larger and larger impact on more and more systems and more and more integrations in their enterprise. That's probably the immediate future. Extrapolating beyond that, I think that I've already started seeing a certain open source interest in creating monitoring tools specifically for NServiceBus. One of the advantages of building a system in an asynchronous manner is that you are able to find out much more quickly what the bottleneck of the system is, just by looking at the queues.

There are a lot of people who are interested in creating automated monitoring tooling on top of NServiceBus that can allow administrators to quickly zoom in and say "OK, this is the bottleneck of my system" and then scale out the number of servers that are working over there. As for the longer-term future that I see in the post-2.1 time frame, after we have integrated WebSphere MQ and Sonic MQ: we are already in the process of integrating the Cloud queue providers. Being able to run on top of Amazon and on top of Microsoft Azure, as a first step of allowing people to move to the Cloud directly, is something that we're going to be doing in 2.1. Once we have this monitoring set up (and it's going to take some time to get that out there) we'll be able to leverage the Cloud's elasticity and have administrators actually just script the system to say "Monitoring system, please find the bottleneck for me - in other words the queue which is the most full. Now activate the provisioning APIs of the Cloud in order to automatically increase the number of servers that are running over there or vice versa, de-provision the areas that don't have nearly as many up to a certain cost threshold." This is further-out roadmap kind of stuff: what it is possible to do on top of the standard NServiceBus APIs and approaches to create self-tuning, self-healing, elastic systems on top of the Cloud, with the exact same programming model that people have been using for their fairly simple on-site applications. That's probably one of the things that I'm very excited about looking at around the 2.2 timeframe.
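The scripted scenario in this answer (find the fullest queue, provision more servers there, de-provision quiet areas, stay under a cost threshold) can be sketched as a pure planning function. Everything in this Python example is an assumption made for illustration: the queue names, the thresholds, the flat per-worker cost, and the idea that a separate component would turn the returned plan into real cloud provisioning calls. No actual monitoring or cloud API is used.

```python
from typing import Dict, Optional


def find_bottleneck(queue_depths: Dict[str, int]) -> Optional[str]:
    """Return the name of the fullest queue, or None if nothing is monitored."""
    if not queue_depths:
        return None
    return max(queue_depths, key=lambda name: queue_depths[name])


def plan_scaling(queue_depths: Dict[str, int],
                 workers: Dict[str, int],
                 cost_per_worker: int,
                 budget: int,
                 scale_up_at: int = 100,
                 scale_down_at: int = 5) -> Dict[str, int]:
    """Compute a new worker count per queue; the caller would provision it."""
    plan = dict(workers)

    # De-provision: nearly empty queues give back one worker, keeping at least one.
    for name, depth in queue_depths.items():
        if depth <= scale_down_at and plan[name] > 1:
            plan[name] -= 1

    # Provision: add one worker at the bottleneck if the budget still allows it.
    bottleneck = find_bottleneck(queue_depths)
    if bottleneck is not None and queue_depths[bottleneck] >= scale_up_at:
        projected_cost = (sum(plan.values()) + 1) * cost_per_worker
        if projected_cost <= budget:
            plan[bottleneck] += 1
    return plan


# Hypothetical snapshot taken by a monitoring tool.
depths = {"orders": 850, "billing": 12, "email": 0}
workers = {"orders": 2, "billing": 2, "email": 3}

plan = plan_scaling(depths, workers, cost_per_worker=10, budget=80)
capped_plan = plan_scaling(depths, workers, cost_per_worker=10, budget=60)
```

Keeping the decision separate from the provisioning call makes the policy easy to test against recorded queue depths before it is allowed to spend money.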


9. Talking about open source: about a year ago, Microsoft took the initiative to form the CodePlex Foundation. Have you considered NServiceBus as an open source project for that foundation?

I've considered it, I've thought about it, but it's one of those things that I just haven't heard very much demand for or talk about in the user community, among the customers that are using NServiceBus. I asked around and said "What do you guys think about it?" and ultimately they didn't know what to tell me. In other words, they said "We're not exactly sure which problem it solves for us, as users of your open source project." As a provider, my main concern is to serve that community as best I can. I understand their needs, the things that we've talked about so far in terms of the NServiceBus roadmap (stuff that they very much want), and currently I haven't seen anything in the CodePlex Foundation that speaks to the needs of that community. When there comes a time when the services provided by the CodePlex Foundation do address certain needs in the community, I'll probably be more proactive in moving NServiceBus into that kind of ecosystem. But so far I just don't know what it's going to be doing for my users, so I'm focusing my efforts elsewhere.

Jun 16, 2010