Article: A Message Type Architecture for SOA

| by Jean-Jacques Dubray on Feb 09, 2009. Estimated reading time: 1 minute |

This article details a new approach to managing Message Type definitions and establishes a strong relationship between these definitions and the Enterprise Data Model.

I argue that XML, ERD and UML have not been suited to effectively model the Enterprise Data Model or a large collection of Message Types (not individual message types).

I suggest instead creating two Domain Specific Languages (one for the EDM and one for the Message Types, referencing elements of the EDM).

The EDM DSL follows the Domain Driven Design principles and defines several new semantics:

  • entities and basic entities
  • a scope associated to an entity which contains basic entities
  • an entity-to-entity association different from the entity-to-basic-entity association

The Message Type DSL uses the concept of a "projection", which specifies which elements of the base entity's scope should be "excluded" from the message type and which elements of the associated entities should be "included".
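As an illustration only (this is not the article's DSL or its generator), the exclude/include mechanics of a projection can be sketched in Python. The projection and entity names are taken from the examples quoted later in the comments; the attributes `firstName`, `lastName` and `groupName`, and the `Entity`, `Projection` and `resolve` names, are assumptions made for this sketch.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    scope: list[str]  # element names in the entity's scope (kept flat in this sketch)


@dataclass
class Projection:
    name: str
    entity: str
    exclude: set[str] = field(default_factory=set)    # elements dropped from the base entity
    include: list[str] = field(default_factory=list)  # projections of associated entities


# Hypothetical EDM: only SSN, address and coverages are named in the thread.
ENTITIES = {
    "Member": Entity("Member", ["firstName", "lastName", "SSN", "address"]),
    "Group": Entity("Group", ["groupName", "coverages"]),
}

PROJECTIONS = {
    "basicMemberDetailResponse": Projection(
        "basicMemberDetailResponse", "Member",
        exclude={"SSN", "address"},
        include=["groupInformationForMember"],
    ),
    "groupInformationForMember": Projection(
        "groupInformationForMember", "Group", exclude={"coverages"},
    ),
}


def resolve(name: str) -> dict:
    """Expand a projection into the elements its message type would carry."""
    projection = PROJECTIONS[name]
    entity = ENTITIES[projection.entity]
    unknown = projection.exclude - set(entity.scope)
    if unknown:
        # A projection may only refer to elements that exist in the EDM.
        raise ValueError(f"{name} excludes unknown elements: {sorted(unknown)}")
    return {
        "entity": entity.name,
        "elements": [e for e in entity.scope if e not in projection.exclude],
        "included": [resolve(child) for child in projection.include],
    }
```

Calling `resolve("basicMemberDetailResponse")` yields the Member elements minus the excluded ones, plus the nested Group projection. The point of the design is that a message type never declares data of its own; it only selects from the EDM.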

The Message Type DSL implements the Business Envelope Pattern from the Open Applications Group and supports weaving in different aspects, such as versioning, during XML Schema generation. The generated XML Schemas are self-contained, with no import or include elements. They are only referenced by their WSDLs. All the Message Type definitions are managed at the DSL level.

The DSLs were developed with Eclipse EMF and openArchitectureWare's Xtext.



Missing words and typos by Gregor Rosenauer

There are some missing words and rough edges in the first part of the article:
These DSLs are also well suited to create a graphical notation.M2 etamodels (OO and ER respectively).). Both models are actually incompatible with the hierarchical nature of XML documents and its associated schema language. the Hyper-Graph Data Model.

Section "An Enterprise Data Model Domain Specific Language":

UML and ERDs have both a strong lineage heavily anchored in concrete[???]

Actually these three types of Data Models can only be reliably transformed via an intermediary data model called[???]

Re: Missing words and typos by Jean-Jacques Dubray

Yes, sorry, I just realized that. This is a bug in FrontPage and Expression: when you copy a selection, it sometimes changes the content of the selection. I'll correct them later today.

Idempotency, Safety by Stefan Tilkov

Interesting to see you, of all people, adopt a uniform interface …

You define idempotency as an attribute of the message, independent of the verb. Isn't e.g. a GET or CONFIRM request always idempotent in your model? On a related note: Any particular reason not to include a verb (or a message attribute) that signals that a method is "safe" to enable access without consequences and support caching?

Can you give a small example (with source in your DSL, I mean) of a projection?

REST and RPC by Stefan Tilkov

REST is RPC oriented and not message oriented. A response in REST has no particular semantics except in the case of Atom collections. REST does not make any distinction between a "technical" acknowledgement and a "business" acknowledgement.

Unsurprisingly, I strongly disagree with applying the "RPC" moniker to REST. My guess is that what you are referring to is just one typical characteristic of RPC, which is that it's a request/response model. REST doesn't share the other two main characteristics of RPC, an application/service-specific interface and the idea of hiding networking behind a local programming model.

Re: Idempotency, Safety by Jean-Jacques Dubray

Well, the verbs are UniformInterface | Any, so the uniform interface only supports the "standard" verbs and is an extensible mechanism to standardize as many verbs as you need within your organization. So it is not as uniform as REST for instance.

The DSL here is a proposal; I highly encourage people to include the attributes that they feel are important for a given message. I am not sure idempotency is a property of a verb. Of course, in REST it makes total sense to associate idempotency with a verb. However, REST is missing the action dimension. When you consider actions, it might be hard to always assert that (action, noun) will behave consistently whatever the noun.

If I take the verb "Pay" and in my enterprise I have "Claims", "Bills", "Invoices",... I may want to standardize on the verb Pay, but I can only guarantee that my payClaim operation is idempotent. For some reason, when I implemented my billing service we could not achieve that result.
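A minimal Python sketch of that idea, assuming a simple in-memory registry: idempotency is recorded per (verb, noun) message type rather than per verb. `MessageType`, `REGISTRY` and `may_retry` are hypothetical names used only for illustration; they are not part of the article's DSL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MessageType:
    verb: str
    noun: str
    idempotent: bool  # attribute of the message type, not of the verb


# The same verb "Pay" carries a different guarantee depending on the noun.
REGISTRY = {
    ("Pay", "Claim"): MessageType("Pay", "Claim", idempotent=True),
    ("Pay", "Bill"): MessageType("Pay", "Bill", idempotent=False),
}


def may_retry(verb: str, noun: str) -> bool:
    """A consumer may blindly resend a message only if its type is idempotent."""
    return REGISTRY[(verb, noun)].idempotent
```

With this layout, `may_retry("Pay", "Claim")` and `may_retry("Pay", "Bill")` give different answers even though the verb is the same.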

In SOA, by definition all messages are "safe" :-) since they do not mandate any changes anywhere (remember we are not CRUDing). The receiver of the message is ultimately responsible for deciding what to do with it. The mere existence of "safe" proves that REST had CRUD in mind when it was designed.

Caching does not work in the enterprise except in very limited cases, which are then controlled by the Service Provider. The service consumer does not need to know anything about it. I don't know of any instance where a service consumer would want to cache (or want someone else to cache) the content and state of a business object (Account, Claim, Bill...). None of the business entities are cacheable.

Figure 7 provides examples of Projection definitions:

projection basicMemberDetailRequest { &entity Member
    exclude { address; }
}

projection basicMemberDetailResponse { &entity Member
    exclude { SSN; address; }
    include { groupInformationForMember; }
}

projection groupInformationForMember { &entity Group
    exclude { coverages; }
}

Re: REST and RPC by Jean-Jacques Dubray

After seeing how people use REST in practice, for instance Doug Purdy explaining how he created a "Hello World" RESTful "Service" with Oslo by offering a POST /service/helloworld/{string} syntax that returns a "Hello {string}" resource representation (as you can see, a "safe" and "idempotent" VERB+NOUN combination), I can safely say that 99% of developers and architects will use REST that way.

The problem is not so much whether REST was intended to be RPC or not; the problem is what kind of architectural constraints the REST model provides for people to do the right thing (no RPC and no CRUDing). I don't want to argue about REST much more, but the fact of the matter is that people who use REST are either remoting or CRUDing, and we know what kind of connected system you can build with these two approaches.

The remoting people have tried to make WS-* very remotish (as hard as they could), but fortunately, WS-* has intrinsic properties (inherited from the B2B days and ebXML) that let some people use a message oriented approach. WS-* or SCA support:
- bidirectional interfaces
- forwards compatible versioning mechanism
- assembly mechanisms
- orchestrations (and choreographies)
- the ability to substitute one service provider for another by changing a single endpoint
- federated security
- ...

Re: Idempotency, Safety by Stefan Tilkov

In SOA, by definition all messages are "safe" :-) since they do not mandate any changes anywhere (remember we are not CRUDing). The receiver of the message is ultimately responsible for deciding what to do with it. The mere existence of "safe" proves that REST had CRUD in mind when it was designed.

You have your logic backwards. In HTTP, there's a guarantee that "safe" methods (OPTIONS, HEAD, GET) can be called without negative consequences for the caller. For this reason, they can be used by a search engine, or to receive metadata, or to get something to present to the user, or to get at the next possible state transitions.

Re: Idempotency, Safety by Jean-Jacques Dubray


I understand why REST needs to make the distinction, but again, this only exists because REST has a very small set of verbs. In practice and in reality you have N+4 verbs, so I don't see any real value in making this distinction. If a method is directly acting on the system of record in REST, you may want to flag it, but again, this is a pattern that is not desired for connected systems. I don't want anyone to manipulate my system of record. As a service provider, I am responsible for making this decision. So you can argue that all (~N+2) messages are safe, or none (~N+2) of the messages are safe. Since you can argue in both directions, it tells me that this attribute is not suited for SOA.

Now, if you have special agents arbitrarily calling methods of a certain type, you may want to flag it with whatever attribute you want (calling it "safe" is ok). I would however point out that you don't let agents roam around your systems of record. So, in SOA, there is a "safe" method, which is getEmployeeSafeDetails, and there is another one that is really not "safe", getEmployeeCompleteDetails (with salaries, SSN,...). Again, I don't see the distinction as valid in the enterprise, while I fully understand why it is in REST.

Re: Idempotency, Safety by Jean-Jacques Dubray


Incidentally, I have not talked about it, but you can design your EDM DSL to mark fields that are sensitive with respect to privacy.

A very important aspect in the enterprise is to figure out whether a message contains "private" data or not, so I guess you could also declare a message to be "sensitive" or not based on whether it contains sensitive EDM elements.
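One way to picture this, as a sketch rather than anything the DSL currently does: if the EDM marks individual elements as sensitive, the sensitivity of a message type can be derived from the elements it carries. The markings below and the `is_sensitive` helper are assumptions made for illustration.

```python
# Hypothetical privacy markings on EDM elements, as (entity, element) pairs.
SENSITIVE_EDM_ELEMENTS = {("Member", "SSN"), ("Member", "address")}


def is_sensitive(message_elements):
    """A message type is sensitive if it carries at least one sensitive EDM element."""
    return any(element in SENSITIVE_EDM_ELEMENTS for element in message_elements)


# Elements carried by two hypothetical message types.
basic_response = [("Member", "firstName"), ("Member", "lastName")]
full_response = basic_response + [("Member", "SSN")]
```

Here `is_sensitive(full_response)` is true and `is_sensitive(basic_response)` is false, so the flag never has to be maintained by hand on each message.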

This message type architecture is not an end-all be-all, it simply details an approach that can help you manage message type based on any requirement that your organization may have without having to reify your enterprise specific requirements behind very general concepts.

Re: Idempotency, Safety by Stefan Tilkov

I don't understand how you can't see the value of a safe operation; it's the very basis for much of the Web's success. Being able to distinguish between safe and unsafe operations gives a lot of power to the infrastructure, e.g. for indexing purposes. Admittedly, you'd not get much out of it, since you don't do links. So I'll move on to the next topic.

Re: Idempotency, Safety by Stefan Tilkov

Caching does not work in the enterprise except in very limited cases, which are then controlled by the Service Provider. The service consumer does not need to know anything about it. I don't know of any instance where a service consumer would want to cache (or want someone else to cache) the content and state of a business object (Account, Claim, Bill...). None of the business entities are cacheable.

Again, I strongly disagree. Let me point to the most obvious example: An invoice that has already been sent out will quite probably never change again. Why should a client not cache it? The same is true for a claim that has already been processed, or an account that has been closed.

Let me elaborate on the last example: An account that is updated only once per night can be cached all through the day.

We can argue about the relevance or the percentage of cases where this is useful. I claim it's useful in a vast majority of cases. I accept when you disagree and claim it's lower, but not when you say it's zero.

Re: Idempotency, Safety by Jean-Jacques Dubray


I can see the value for the Web, absolutely; it is just that I can't see the value in the enterprise. I understand the value of links, especially at the presentation layer: if you are a CSR, you need to navigate freely and arbitrarily to get to whatever piece of information is relevant to your conversation with the customer.

But I don't see a service consumer getting an Entity representation and then navigating arbitrarily to links and exploring the content of linked entity representations. I am sure that some enterprise might have a need for something like this, but I don't see why when you get a Purchase Order in your business process you would need to navigate to Customer information. The PO contains the customer information that the business process needs. Nobody wants to do a second call to get a "shipping address", it is expected to be in the PO document.

If you had an example that was not "indexing" or "search engine", that would be great. (Again for the Web, I totally understand the criticality of the concept and why you say this is one of the keystones of the Web success, no question there).

Re: Idempotency, Safety by Jean-Jacques Dubray

Yes, I think the key is to consider the frequency at which a particular instance is going to be invoked. How often do you call a company to figure out the state of an order or whether you paid your bill? How many times do you need to fetch an order before you set out to pay it?

I can see caching helping in some MDM scenarios, but even in CDI, it will really depend on how often a customer interacts with your call center. So I am sorry, again, I don't see a big advantage in caching, but nevertheless all this can be put in the model if you see any benefit; I have no problem with it.

I think what is clear to me today is that:
- every enterprise is different
- technologies cannot reflect that variability
- reifying concepts into whatever a given technology provides does not work, tools are missing, it creates inconsistencies between developers and projects
- today there are great technologies such as Xtext that allow you to pick your preferred technology stack (HTTP+ATOM) or (WS-*+SCA) and layer the semantics that you need on top. If one day you need to change stacks for whatever reason, or, God forbid, you actually need to use both stacks to build realistic connected systems, then it is possible.

So for me, these debates are less important today than they were a month ago. All I know is that neither REST nor WS-*/SCA can give me the semantics that I need, though I can use all the capabilities that they provide, including caching.

Nice article by Peter Rajsky

It is a really interesting article.
We use UML for generating XML schemas. A week ago I wrote a blog entry about the same problem, which I feel with our approach.
A combination of UML for defining business entities and a message type DSL for "envelopes" would be fine.

Re: Nice article by Jean-Jacques Dubray


Thank you for your comments. Yes, it is probably possible to do it with UML. I tried and I failed because of the complexity of the metamodel. One advantage of using a DSL for both is that Xtext knows how to cross-reference elements in two different files (I have not implemented that yet). I worked on a project in 2007 with adaptive software that used UML and this approach to generate schemas. I left the company I was working for before the project was finished, but the consultant who continued the work told me that he was able to generate XML Schemas from it, though I think they had to be adjusted by integration developers to get the final product.

This article I think provides a solution to the problem you exposed in your post:


Re: Nice article by Peter Rajsky

Yes, this approach really solves my problem, but there are probably a few negatives (minor, except for the first one):

  • (1) Using UML allows us to use EDM entities, service specifications and service operations in the different views of our architecture (e.g. we can define the process view using sequence diagrams and activity diagrams built from these model elements, and we can define the logical architecture using component diagrams). This DSL defines only a small part of the model we need in practice (at the moment).

  • (2) As entities are "regenerated" in a new namespace, we cannot use generic transformations, which transform an EDM entity to an application model entity in many contexts (e.g. the EDM entity Subscriber is regenerated in the namespace of the service FetchSubscriberData and in another namespace for the service ActivateSubscriber).

  • (3) There could be namespace chaos, as entity namespace information is lost in the message schema. This is a very minor problem - I do not expect that the EDM contains entities with the same name.

You are right that UML profiles are too weak for defining projections (views). I hope there will be better tool support for PIM-to-PSM transformations, which could probably be used for this purpose.
Meanwhile I will store message DSL definition as tagged value :(

Thanks for sharing your ideas.

Re: Nice article by Tiberiu Fustos

Hi JJ & Peter,

I agree - it's a nice article, but it requires some "digesting" and tooling...Just from our experience (SOA approach to deal with legacy silos in a telco): we have done our BOM in UML and we have partitioned it into different domains. Each domain exposes capabilities via services to the other domains. In order to avoid the complexity problem when inferring the message model from the CDM (described in your blog), we allow the messages to have different representations (schemas for each domain) - I assume these are the "projections" or views you are talking about.

Example: the order management domain does not need all the Customer entity attributes and structure (associations) used in the CRM domain. The CRM Domain remains the master of the Customer Entity in the BOM, but the message models are allowed to differ (thus the Order Management programmer only has to worry about the relevant sub-set of his domain).

The tooling - well, we use a widely used UML tool (EA) with a home-made plug-in and some self-defined constraints for generating the XML schemas and service contracts from each message model. The BOM -> Message Model transformation is however a manual process, we only use the tool support for the Message Model -> XSD (WSDL) transformation.

Re: Nice article by Jean-Jacques Dubray

thanks guys,


Again, I think it is necessary to walk away from UML; if you need a graphical notation, use GMF on top of the Xtext grammar DSL. It creates an Ecore metamodel, and I believe that Xpand (which transforms a model into XML Schema) will simply transform the Ecore-based metadata into XML Schema.

You might also consider creating an XMI file from the Xtext model definition. The bottom line is that you don't have to use UML.

>> There could be namespace chaos
I may be wrong, but I don't think so. Remember that the schemas are now generated, so you don't need to slice and dice them; there are no imports. The only use of the namespace is to keep track of the major version of the message type (it's okay to keep a classification in terms of business areas and process areas, for instance, but it is not mandatory).

>> the order management domain does not need all the Customer
>> entity attributes and structure (associations) used in the CRM
>> domain.
If you are saying that a PO has some customer data, but when you ask for a customer profile you have a lot more attributes, this is already covered by the projection mechanism. This is actually the whole goal of the projection mechanism.

Now if you mean that a PO in an order management system looks different from a PO in a CRM system, I would argue that's covered too. The model can expose many Query/Response interactions via different operations. The projection mechanism ensures that each query/response message is defined by reusing as much of the EDM as possible.

>> The BOM -> Message Model transformation is however a manual
>> process
I think this is where the projection concept is going to help you.

The Message Type to XML is fairly straightforward, you have simply created a UML model of the message type. The holy grail is to integrate the EDM in the chain.

I have added more comments here:

Re: Nice article by Peter Rajsky

I do not understand why you "mix" two (for me) different problems in your article:

  • (1) Message type architecture - modeling of message types

  • (2) "Fight" against EDA :)

I do not see any relation between these two problems. I really like your approach to message type architecture, but I can't accept your approach to EDA and ESB.

  • (A) SOA is not integration, but SOA is not everything... We really need to integrate silos or systems that maintain different views of entities (e.g. billing needs to react to creation of order in CRM). EDM is not related to SOA only, it is important for EDA too. (Btw. EDA is not related to "replication" of data only).

  • (B) Jack van Hoof's statement was really related to EDA (which is complementary to SOA). And your approach to message type architecture can be used for this purpose too!

  • (C) I really believe ESB platforms shall move their focus and support SOA & EDA together. New "ESB" platforms shall support publishing of business events (e.g. CustomerCreated from CRM), reaction to these events (orchestration; e.g. Store CustomerData in Loyalty System) and services (e.g. FetchCustomer on CRM). Because it seems that ESB providers accept your view at the moment. But I am not happy with it :(

I agree REST is related to internet-wide integration. I am a fan of REST for this purpose, but not in the enterprise.

Re: Nice article by Jean-Jacques Dubray


I am not necessarily "fighting" EDA, what I am fighting is more the notion that Gartner introduced years ago that somehow EDA would replace SOA (and WOA too).

My point was to show that:
a) a CIM is not useful, in general (but not in the context of B2B), the service interface is the CIM.
b) there is a difference between SOA and Integration even though they are using the same technologies
c) Events can be very easily integrated in the SOA model and in that case, these are real events. If you use a "pure" EDA approach (which I have seen for instance at SUN in the late 90s), you are forced to "reify" lots of semantics behind a pub/sub mechanism.

In the article's model, you can naturally define an event as the occurrence of a state of a particular entity and nothing more. So ultimately this article is really the foundation to unify resources, services and events. I am pragmatic; I think it is better to give people the means to do that than to tell them precisely how they should do it. Every company is different and, as I said earlier, it is best to define the semantics that fit your problem model best. For instance, in Healthcare, "privacy" is a key issue, much more than "safe" and "idempotency". In finance, transaction volume is (or was) the problem; privacy is not a major concern (it is, but it is not as spread across the EDM), so there, concepts like idempotency and safe are important. By having very precise semantics at that level, you can start consistently weaving in aspects that will make your SOA that much simpler and more efficient.

Thank you by Hermann Schmidt

Thank you for this inspiring article! You are addressing so many of the problems I am struggling with. I am currently burning in hell with a UML-based class model (in Enterprise Architect) and a generator built with oAW, which I have to maintain. In the same project we have a humungous XML Schema with "the canonical model". Two big mistakes in a row.

I have suffered from three naive implementations of a canonical model now and I am fed up with it. I'll not watch a fourth false attempt silently.

I will pursue the strategy you are suggesting. I've had some similar ideas in my mind but I never got them nailed down.

Re: Nice article by Tiberiu Fustos

JJ wrote:

The Message Type to XML is fairly straightforward, you have simply created a UML model of the message type. The holy grail is to integrate the EDM in the chain.

This is exactly the point. We have the BOM (EDM, CIM), we have the message types per domain, but it's not traceable. If I get you correctly, with the DSL you still have to define the projections "manually", but you have the whole thing consistent, without creating the scary "company-wide XSD" that Hermann described or losing consistency (which is what I experienced).
You definitely hit a pain point here, it's worth digging deeper!

Re: Nice article by Jean-Jacques Dubray

Yes, I think we have all been there; I too have struggled with UML and tried to create modular XSDs. The only reason I prefer a DSL over UML profiles is that you have a lot more control over the semantics of your model. In theory profiles are very powerful; in practice they are hard to deal with. There are also semantics that are very hard to represent in profiles: for instance, a "choice" is a great data structure semantic that XSD innovated on, but in UML it's hard to model. A DSL gives you complete flexibility.

The reason I prefer textual over graphical is simply that graphical plugins are not as robust (and easy to use) as the ones from Xtext. Of course, if I could manage the EDM graphically I would, but that's not easy either with DSL Tools (MS) or EMF.

You guys can email me, and I'll send you the Xtext files gmail / jdubray

Re: Nice article by Kjell-Sverre Jerijærvi

Unfortunately, the term CIM has been reified by parts of the community from being a common representation of business entity objects like "Customer", as defined by Eric Roch, to also include the event data, such as the payload for "CustomerHasMoved" messages, and even the event taxonomy.

Two examples of CIM reification:

"you need to create your common information model. That model must contain not only information entities, but also a notion of what business documents you will communicate with, and what events occur on each document"

"CIM is a completely controlled and totally governed centralized data model that defines the dataflow of an Enterprise Service Bus (ESB)"

I use the same definition as Eric; thus CIM and EDM are similar concepts that model business entities and differ only in scope: the latter encompasses all data in all systems of record, while the former is domain-driven and comprises only the parts of the systems of record that pertain to SOA.

Re: Nice article by Kjell-Sverre Jerijærvi

An explanation of CIM based on articles by Mike Rosen and Eric Roch: Common Information Model.
My post also relates CIM to the recognized SOA design patterns "canonical schema" and "schema centralization", as I agree with JJD that the DSL approach has fewer negative side effects.

Re: Nice article by Jean-Jacques Dubray

Yes, I think it is important to avoid making the service interfaces static via a canonical schema or schema centralization. Personally, I am more of a "bottom-up" guy when it comes to building the EDM, so I am ok with a CIM approach. Again, "static" schemas must be avoided; this is what kills reuse in the enterprise. In B2B, of course, the problem is different: you want more "staticity" to build very large consumer communities.

Really nice post by Alejandro Raiczyk

It helped me so much.

One question, when you say:

"The message type may only contain projections, i.e. references to entities, basic entities, associations and attributes of the EDM. There is no provision to add elements that may not be part of the EDM. This is a design decision as the EDM is supposed to represent the enterprise data model and all data elements stored in the systems of record are supposed to be traced to an element of the EDM."
You mean that every field in a message should be projected to the EDM. What happens with fields that are not part of the EDM, for example a filter that is not by example like “sendEmailConfirmation” in a transaction message?

Another one, you've defined

projection basicMemberDetailRequest { &entity Member
    exclude { address; }
}

query MemberQBE on basicMemberDetailRequest;
message getMemberBasicInformation {
    verb GET;
    noun Member;
    query MemberQBE;
}

What kind of XML do you generate from that definition? Something like this?
I would like to see some generated requests/responses, I don't know if this is possible.
Thanks a lot!

Re: Really nice post by Jean-Jacques Dubray


Thank you for your kind comments. I just saw your question today; sorry for the late reply.

>> You mean that every field in a message should be projected to the EDM.
Actually, projected from the EDM: the fields are basically a subset of the EDM, but they do not necessarily follow the boundaries of the entities defined in the EDM.

>> for example a filter that is not by example like “sendEmailConfirmation”
Well, I would argue that somewhere there should be an attribute of the transaction that is "emailConfirmationRequired".

>>what kind of xml do you generate from that definition? Something like this?
Yes exactly, in this case (for a QBE) the fields are optional (query by first name, by last name, ...)
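The thread does not show the generated XML, so the following Python sketch only illustrates the "optional fields" idea behind a query by example: the request carries just the criteria the caller supplies. The element layout, the field list (`firstName`, `lastName`, `SSN`) and the `build_qbe_request` helper are assumptions, not the output of the article's generator.

```python
import xml.etree.ElementTree as ET

# Hypothetical Member elements left after excluding "address" in the projection.
QBE_FIELDS = ["firstName", "lastName", "SSN"]


def build_qbe_request(message_name, criteria):
    """Build a query-by-example request in which every field is optional."""
    unknown = set(criteria) - set(QBE_FIELDS)
    if unknown:
        raise ValueError(f"not part of the projection: {sorted(unknown)}")
    root = ET.Element(message_name)
    member = ET.SubElement(root, "Member")
    for name in QBE_FIELDS:  # keep a stable element order
        if name in criteria:
            ET.SubElement(member, name).text = str(criteria[name])
    return ET.tostring(root, encoding="unicode")
```

For example, `build_qbe_request("getMemberBasicInformation", {"lastName": "Smith"})` produces a request containing only a `lastName` element, while asking for `address` is rejected because the projection excluded it.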
