Contract Versioning, Compatibility and Composability

Posted by Kjell-Sverre Jerijærvi, Jean-Jacques Dubray on Dec 09, 2008

In recent weeks, many industry analysts have been quick to point out fears, uncertainties and doubts about SOA. Gartner, for instance, claims that the number of companies planning to start SOA initiatives is falling, while the share of companies planning no SOA initiatives has increased from 6 to 16% in the last 12 months. These notes and articles often sound as if companies no longer believe in building reusable IT assets that can be composed into different solutions.

We believe that the explanation for this lower level of interest in SOA is quite different: one of the key failures of SOA initiatives has been precisely the inability to produce reusable and composable assets. It seems as if every new consumer brings enough fresh requirements to mandate a service different from the existing ones. Since services are more expensive to design, build and operate than more traditional solution architectures, SOA cannot realize a large part of its business case when these services can’t be reused. In our experience, few companies have been able to move beyond primitive reuse of services to achieve the promised effects of SOA.

If you ever hope to reuse a service, it is imperative to have clear design guidelines for contracts that express what this service provides and how it can be consumed. You may be able to start your SOA initiative without much SOA Governance, but it would be a mistake not to have contract design guidelines. Over time, these design guidelines will of course become a central part of your SOA Governance design compliance policies. It seems, however, that the industry as a whole has put a heavy emphasis on SOA Governance processes as a way to successfully build reusable assets, often dismissing the “Just a Bunch of Web Services” (JaBoWS) approach. This strategy has yet to prove that it is enough to accomplish this particular goal.

We argue that SOA Governance is necessary but cannot generally deliver Service Specifications that are at the level required for reusing assets over long periods of time. SOA Governance is important, and as much governance as practical should be practiced when identifying, specifying and designing a service. However, because of limited resources and time, and the inability to predict the future well beyond a 3-6 month horizon, Governance alone cannot create reusable assets. Inevitably, new and unforeseen service consumers will come with new requirements that will mandate an evolution of existing services. Without a proper versioning strategy, new versions of a service will result in a new service altogether, creating the need for similar, yet separate registry entries, code bases and lifecycles. Organizations without a proper versioning strategy often deploy, operate and maintain several “versions” of the same service in production. This drastically limits SOA’s benefits and Return-On-Investment, and fuels comments from some analysts who have been quick to claim that “reuse” should not be expected as part of a SOA initiative.

In this article, we provide a series of recommended practices for establishing a Service Contract Versioning strategy geared towards service reuse, composability and compatibility with prior consumers (or providers). We claim that such a versioning strategy is essential to achieve satisfactory levels of service reuse and, in turn, to generate the higher (and expected) ROI from SOA initiatives.

Elements of a contract

“A contract is the most important metadata in SOA”, says Ron Schmelzer, analyst at ZapThink. There are many parts to a contract between a service provider and a service consumer. Some of these elements can be made machine readable, and some can even be enforced at runtime, using technologies such as the XML Schema, WSDL and WS-Policy specifications, while other elements will remain mostly for human consumption.

The goal of this paper is not to look at every possible element of a contract, from the format and sequence of the bits that travel on the wire to service level agreements or legal statements (about privacy for instance). Rather, we would like to focus on one simple problem: defining a strategy that supports the evolution of a service provider or consumer while remaining compatible with existing consumers or providers, respectively. We believe that this strategy is core to delivering the benefits of SOA and can greatly reduce both the initial cost of building a service (because it reduces the need for governance) and the ongoing service maintenance and operations costs.

The proposed versioning strategy is twofold:

  1. Create machine readable contract elements that express all rules enforced at run-time by both the service consumer and provider
  2. Create or leverage a set of compatibility rules between the versions of the contract elements agreed upon by service consumers and providers

Machine readable contract elements are the foundation of versioning because this metadata can and must be used by both the service consumer and provider to decide whether the current message sent to and received at an endpoint will and can legitimately be processed. These elements define the lowest bar for a potentially successful processing of a given message. Of course, business rules in the service implementation can potentially return a contract exception.

One strategy could be to define no machine readable contract element and keep exchanging exceptions each time the consumer or provider cannot process a message, until either the consumer or the provider fixes the problem. For instance, members of the REST community have advocated the adoption of a “uniform contract” dismissing the need for further machine readable elements which could have otherwise provided unambiguous semantics while assisting in the implementation, validation or configuration of a service. Yet, the REST community does not provide evidence that RESTful Web Services can be evolved in a compatible way.

The problem is less about which contract elements one needs than about how to facilitate the expression of compatibility between two versions of a service provider (or consumer). In other words, if a consumer interacts with a new version of a service provider, are we expecting a behavior equivalent to the prior version or not? Adopting a “uniform contract”, such as agreeing simply on sending and receiving messages at a particular location, is not going to help answer the compatibility question.

In this article we will focus on the metadata necessary to validate whether an incoming SOAP message is legitimate or not. So we will focus on the contract elements defined in a WSDL document. Since the usage of WSDL 2.0 is not yet widespread, we will be using WSDL 1.1. We do not believe that there would be any significant changes when applying this strategy to WSDL 2.0. The elements of WSDL 1.1 include:

  • Service definition target namespace
  • Message types
  • Message definitions
  • Faults
  • Port types
  • Bindings
  • Service definitions
  • Optional policies expressed using WS-Policies

 

Fig 1. Elements of a WSDL 1.1. Contract

Versioning guidelines

Service and Schema Versioning can lead to these types of compatibility scenarios (Fig 2):

  • No compatibility
  • Forwards Compatible
  • Backwards Compatible

A new version of a contract that continues to support consumers designed to work with the old version of the contract is considered to be Backwards Compatible.

A contract that is designed to support unknown future consumers is considered to be Forwards Compatible.  Such a contract anticipates that consumers will evolve over time by supporting extensibility through XML Schema wildcards. This type of compatibility is often found in B2B scenarios where a consumer works with multiple service providers which may evolve at a slower pace. In the enterprise, the most common scenario is backwards compatibility.

Backwards compatibility is what is typically meant by “compatible” and is possible when non-breaking changes are made to a contract. Making breaking changes to a contract will always lead to no compatibility. These scenarios have been extensively described in the literature by John Evdemon or Dave Orchard for instance.

Note that contracts can be both backwards and forwards compatible (more on this later).

Fig 2. Service Operations Compatibility Scenarios

The degree of support that Web Services technologies offer for both forwards and backwards scenarios is unmatched in the world of distributed computing technologies. None of RPC, CORBA, DCOM or JEE has been able to support these scenarios out of the box simply by applying specific design constraints. As a matter of fact, the ability to support these scenarios is key to achieving loose coupling and reuse.

When a service is evolved in a backwards compatible manner, it can be reused by more consumers without requiring any change to the implementation or configuration of existing consumers. When Governance has failed to predict the needs of future consumers, or when the budget did not allow all the needed features to be delivered in V+0, compatible service versioning is what enables updates without impacting existing consumers’ operations. In particular, this is how “JaBoWS” can slowly but surely evolve into Enterprise Class services, serving a wide range of consumers.

Compatibility is defined by a stated versioning scheme:

  • Major versions: incompatible (breaking change)
  • Minor versions: compatible (non-breaking change)

Breaking changes are modifications, for instance to existing schema types, that will cause processing of incoming messages to fail, such as changing an existing optional element to be mandatory or adding new required elements. Non-breaking changes include, for example, adding a new optional element. Non-breaking changes to schemas or services are registered as minor versions, while breaking changes will always be registered as a new major version.

All the versionable artifacts of a service will be affected by changes, and these changes will ripple through the artifacts from bottom to top, and eventually the ripple effect will impact your consumers. This ripple effect is detailed in table 1 and illustrated in figure 3. While versioning lets you control the effects of changes, compatibility helps you alleviate some of the negative effects of versioning.

Fig 3. The ripple effect of changes

In the following, we will state that a service version is incompatible by using a new major version, and that it is compatible (forwards or backwards) by using a new minor version. Point versions should be used to denote versions that did not involve contract changes and were limited to implementation, deployment or configuration changes (for instance a bug correction, a new service container release, etc.). Point versions are always compatible by definition (again, this is a statement, not a guarantee).
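The numbering scheme described above can be sketched in a few lines of Python. This is only an illustration: the `Version` type and the `bump` function are hypothetical names, and we assume the common convention that raising one level resets the levels below it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """A stated service version: major.minor.point (hypothetical helper type)."""
    major: int
    minor: int = 0
    point: int = 0

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.point}"


def bump(version: Version, level: str) -> Version:
    """Return the next version for a change registered at the given level."""
    if level == "major":   # stated as incompatible (breaking contract change)
        return Version(version.major + 1, 0, 0)
    if level == "minor":   # stated as compatible (non-breaking contract change)
        return Version(version.major, version.minor + 1, 0)
    if level == "point":   # no contract change: bug fix, redeployment, configuration
        return Version(version.major, version.minor, version.point + 1)
    raise ValueError(f"unknown level: {level!r}")
```

Because compatibility is a statement rather than a guarantee, the level is chosen by the service owner and passed in; the helper does not try to infer it from the contract.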

 

Artifact                 Change                           Major  Minor  Point
Service                  Breaking                         X
Service                  Non-Breaking                            X
Composite Service        Incompatible Schema Service      X
Composite Service        Compatible Service                      X
Service                  Incompatible Schema              X
Service                  Compatible Schema                       X
Schema                   Breaking                         X
Schema                   Non-breaking                            X
Schema Aggregate         Incompatible Schema              X
Schema Aggregate         Compatible Schema                       X
Code                     Bug fix / maintenance                          X
Code                     Safe modifications                      X
Code                     Semantic / Unsafe modifications  X
Other service artifacts  Any                                            X

Table 1. The versioning ripple effect of changes to code, schemas and services

Two message types can be stated to be compatible using these simple rules when designing Message Type Schemas:

  • XML Namespace values must be constant for a given message type and a given major version
  • Message type schemas must include mandatory minor and point version custom attributes on the message type root element, using for instance the type xsd:int.
  • Whether XML validation is used or not, each message consumer must verify that the major version of an incoming message is matching its own implementation version. An exception should be returned when the major versions of the message sender and receiver do not match.
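As a minimal sketch of the three rules above, the following Python fragment reads the major version from the root element’s namespace and the minor and point versions from root attributes. The namespace `urn:example:order:v1`, the attribute names `minor` and `point`, and the exception name are assumptions made for this illustration only.

```python
import xml.etree.ElementTree as ET
from typing import Tuple

# Hypothetical namespace: constant for a given message type and major version.
SUPPORTED_NAMESPACE = "urn:example:order:v1"


class MajorVersionMismatch(Exception):
    """Raised when sender and receiver disagree on the major version."""


def read_version(message_xml: str) -> Tuple[int, int]:
    """Check the major version (namespace) and return (minor, point)."""
    root = ET.fromstring(message_xml)
    # ElementTree exposes qualified names as "{namespace}localname".
    namespace = root.tag[1:].split("}", 1)[0] if root.tag.startswith("{") else ""
    if namespace != SUPPORTED_NAMESPACE:
        raise MajorVersionMismatch(
            f"expected {SUPPORTED_NAMESPACE!r}, got {namespace!r}"
        )
    # Mandatory version attributes on the root element (names are assumed).
    return int(root.attrib["minor"]), int(root.attrib["point"])
```

The check works whether or not schema validation is switched on, which is the point of the third rule.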

In the .Net world, Message Type artifacts are also known as data contracts.

Web Services Extensibility Guidelines

The principles of backwards compatibility have been well documented (see also this reference), yet they are rarely applied. For the sake of clarity we are detailing them here.

Service definition target namespace

The WSDL target namespace can be different from the XML Schema namespaces of the message types. The WSDL target namespace must be different each time the service contract or implementation is modified, be it for a point, minor or major version.

This namespace is not used at runtime, provided that SOAP actions are defined manually instead of automatically by the runtime. Therefore, all WSDL definitions MUST include manually defined SOAP actions[1]

Message types

Message type schemas must be designed with the utilization of XML Schema extensibility mechanisms (we provide a detailed discussion of these mechanisms below)

Message type namespaces must only refer to a major version of the service contract, indicating an incompatible service version. Minor version may be specified as an attribute to the root element

Message definitions

Parts can potentially be added to message definitions. However, we recommend using a single part which contains a message envelope whose content can be extended using XML Schema’s extensibility rules.

Faults

Faults cannot be added to an existing operation, unless they pertain to message type extensions associated to a new minor version

Port types

Port types can be extended with new operations

No changes can be made to an operation’s signature or to the order in which the operations are called (which is not specified in WSDL)

Bindings

New bindings can be defined, however, existing bindings cannot be modified, including the endpoint of the service

Service definitions

Service definitions can only be extended with new ports; existing ports cannot be altered

Optional policies expressed using WS-Policies

There is no general framework available to assess compatibility scenarios for policies

Table 2. Service Definition Compatibility Design Rules

The design of Message Types using XML Schema must be carefully planned to achieve backwards compatibility, also known as XSD extensibility.

John Evdemon explains:

The XML Schema standard introduces <xsd:any> as a wildcarding element. <xsd:any> enables schemas to be extended in a well-defined manner. <xsd:any> includes a namespace attribute that either constrains or extends the range of elements that might appear within the wildcard. The namespace attribute can be set to any of the following:

  • ##any enables the use of elements from any Namespace to extend the schema.
  • ##targetNamespace restricts wildcards to the elements that appear within the targetNamespace.
  • ##other makes it illegal to extend the schema using elements from the targetNamespace.

The processContents attribute dictates how schema extensions should be validated by the parser:

  • strict requires the parser to validate all schema extensions.
  • skip turns off validation for schema extensions.
  • lax validates elements from supported namespaces and ignores unknown or unexpected elements (most Web services specifications use lax).

 

Regarding XSD extensibility, there is a snag in the XML Schema 1.0 specification. Because of the unique particle attribution (UPA) rule, validation becomes ambiguous for XML Schema types whose last element before the wildcard is optional or may occur an unbounded number of times: the validator cannot tell whether an incoming element belongs to that declaration or to the wildcard. For that reason, when the last element of the type has a variable cardinality, we have to add a specific element with a cardinality of exactly one before using the <xsd:any> element.

For instance, you may want to choose an element such as this one:

<eovmxmy>, where x and y are the major and minor version numbers (eov stands for end-of-version)

In XML Schema 1.1 this ambiguity will be removed and this additional element will not be needed.
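To make the sentinel pattern concrete, here is a sketch of a message type schema for a hypothetical `Order` message at version 1.0, held in a Python string so that the layout can be inspected. The namespace, the element names and the attribute names are our assumptions. The Python standard library has no XML Schema validator, so the code only checks that the fragment is well-formed and that the end-of-version element sits immediately before the wildcard.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"

# Hypothetical message type schema, major version 1, minor version 0.
ORDER_SCHEMA = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="urn:example:order:v1"
            elementFormDefault="qualified">
  <xsd:element name="Order">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="id" type="xsd:string"/>
        <xsd:element name="note" type="xsd:string" minOccurs="0"/>
        <!-- Sentinel with cardinality exactly one, placed before the wildcard -->
        <xsd:element name="eovm1m0" minOccurs="1" maxOccurs="1">
          <xsd:complexType/>
        </xsd:element>
        <!-- Extension area for elements added by later minor versions -->
        <xsd:any namespace="##targetNamespace" processContents="lax"
                 minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
      <xsd:attribute name="minor" type="xsd:int" use="required"/>
      <xsd:attribute name="point" type="xsd:int" use="required"/>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
"""

# List the particles of the Order sequence as (kind, name, minOccurs).
sequence = ET.fromstring(ORDER_SCHEMA).find(f".//{{{XSD}}}sequence")
particles = [
    (child.tag.split("}")[1], child.get("name"), child.get("minOccurs", "1"))
    for child in sequence
]
```

Without the `eovm1m0` element, the optional `note` element would be directly followed by a wildcard that also admits elements from the target namespace, which is the situation the UPA rule forbids.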

Using XML Schema extensibility features is one thing, but we still need to define and agree on a set of rules for when an older version of a message consumer encounters a newer version of a message. It is likely that this new message will have elements that fall into some of the extensibility sections of the original message schema. The question then becomes: what should we do with these elements? We recommend applying the following rules, which result in Backwards Compatibility requiring no consumer side changes:

  • The behavior of a service when it encounters an unknown element must be clearly defined by an extensibility handling rule as part of the contract
  • New elements added to a new version of a message type designed to be backwards compatible must not invalidate the prior version of the message type
  • Consumers of a message must accept and remove from processing any element that they do not recognize
  • Responses that contain the same types as their corresponding request must add back the elements that were removed prior to processing the request
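The last two rules can be sketched as follows. The element names, the `KNOWN` set and the stand-in business logic are hypothetical, and namespaces are left out to keep the example short.

```python
import xml.etree.ElementTree as ET

# Hypothetical: the elements this version of the consumer understands.
KNOWN = {"id", "quantity"}


def split_unknown(message: ET.Element) -> list:
    """Remove unrecognized child elements from the message and return them."""
    unknown = [child for child in message if child.tag not in KNOWN]
    for child in unknown:
        message.remove(child)
    return unknown


def handle(request_xml: str) -> str:
    """Process a request and return a response of the same message type."""
    request = ET.fromstring(request_xml)
    set_aside = split_unknown(request)  # accept, but remove from processing

    # Stand-in business logic: it only ever sees the elements it knows.
    quantity = int(request.findtext("quantity"))
    request.find("quantity").text = str(quantity + 1)

    request.extend(set_aside)  # add the removed elements back to the response
    return ET.tostring(request, encoding="unicode")
```

A newer consumer that sent an extra element therefore gets it back untouched, while the older implementation can still validate and process the part of the message it is bound to.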

These rules ensure that every consumer is in a position to use XML Schema validation for the elements that it is bound to process in its implementation. In no way do we recommend using XML and XML Schema extensibility in a sloppy way and forgoing validation (Figure 4). Contrary to a widespread belief in the industry, XML and XML Schema extensibility is essential to a compatibility-based versioning strategy, and hence essential to achieving reuse. By a strange twist of fate, this key technological advance has been all too often discounted by industry experts and analysts.

Consumer and provider implementations are asymmetric when it comes to validation:

  • If the consumer validates the incoming messages sent by the provider, they will pass validation (because of the implementation rules defined)
  • However, the service provider implementation must in general keep track of the schemas for all the minor versions of incoming messages and validate each incoming message based on its minor version number for a given major version, possibly even routing the call to variants of the implementation

The reason is that a higher minor version schema cannot, in general, validate a message of a lower minor version. The higher minor version will most likely have required elements in the same target namespace that a lower minor version message does not carry and that a lower minor version schema cannot validate.
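The provider side of this asymmetry can be sketched as a registry keyed by minor version. Since the Python standard library has no XML Schema validator, each “schema” below is reduced to the set of child elements it requires; this stand-in, the element names and the `minor` attribute are assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the real schemas of major version 1:
# the child elements each minor version requires.
REQUIRED_BY_MINOR = {
    0: {"id"},
    1: {"id", "currency"},
}


def validate(message_xml: str) -> int:
    """Validate a message against the schema of its own minor version."""
    root = ET.fromstring(message_xml)
    minor = int(root.attrib["minor"])
    required = REQUIRED_BY_MINOR.get(minor)
    if required is None:
        raise ValueError(f"no schema registered for minor version {minor}")
    missing = required - {child.tag for child in root}
    if missing:
        raise ValueError(f"minor version {minor} message lacks {sorted(missing)}")
    # The returned value can be used to route to a variant of the implementation.
    return minor
```

A 1.0 message is accepted by the 1.0 entry even though it would fail against the 1.1 entry, which is why the provider has to keep every minor version at hand.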

Consumer Variations

There is also a need for “extension areas” in an XML Schema that are different from versioning (Figure 4). For instance, extensions are common when consumer specific variations are needed, i.e. when specialized relationships between consumers and providers (independently of other consumers) are necessary. These extensions can be viewed as a private contract between a particular consumer and the service provider, embedded in the contract common to all consumers. Again, having a strategy to enable these extensions is essential to achieve a good level of reuse of services.

  • Extensions must be treated separately from the general versioning patterns. We recommend implementing extensions using a single element under the root element of the message type.
  • Extensions must not be processed by the message consumer, unless they are explicitly defined in its message type or unless specified with a mustUnderstand=’true’  attribute. The message consumer’s implementation must generate an exception when it cannot process the required elements.
  • It is advised that extensions belong to a different namespace from the root of the message type.
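The extension rules above can be sketched as follows. The extension namespace, the `extensions` container element and the set of understood extensions are hypothetical; only the `mustUnderstand` attribute name comes from the rules themselves.

```python
import xml.etree.ElementTree as ET

EXT_NS = "urn:example:order:ext"  # hypothetical, distinct from the message namespace
# Extensions explicitly defined in this consumer's message type (assumed).
UNDERSTOOD = {f"{{{EXT_NS}}}loyaltyCode"}


class ExtensionNotUnderstood(Exception):
    """Raised when a required extension cannot be processed."""


def select_extensions(message_xml: str) -> list:
    """Return the extensions to process; fail on required unknown ones."""
    root = ET.fromstring(message_xml)
    # A single extension element under the root of the message type.
    container = root.find(f"{{{EXT_NS}}}extensions")
    selected = []
    for ext in (container if container is not None else []):
        if ext.tag in UNDERSTOOD:
            selected.append(ext)
        elif ext.get("mustUnderstand") == "true":
            raise ExtensionNotUnderstood(ext.tag)
        # Otherwise the extension is unknown and optional: leave it unprocessed.
    return selected
```

Keeping the extensions in their own namespace lets the consumer tell them apart from elements that arrive through the versioning wildcard.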

Fig 4. Versioning and User Extension Compatibility

With Web Services technologies, the fact that one may use different endpoints to access the same information via different versions of the business logic greatly simplifies the maintenance and operations of the corresponding services. Overall, Web Services technologies, XML and XML Schema offer unprecedented opportunities to support compatibility scenarios, unlike the distributed computing technologies before them, and unlike REST, which cannot rely on a stable contract to specify a versioning strategy. In REST, each resource (instance) exposes an endpoint, which creates a strong coupling between the endpoint and the resource (instance). This coupling makes it difficult to operationally manage multiple major versions of the implementation of the business logic, as REST offers no room to introduce a layer of indirection between the two. It also makes it difficult to assign a version to a resource type since, in REST, resource types do not exist and each resource (instance) can potentially implement its own version of the business logic associated with a particular version of an HTTP verb and an arbitrary noun.

Compatibility & Composability

The compatibility scenarios defined earlier can be combined into these strategies as defined in the latest Thomas Erl series book Web Service Contract Design and Versioning for SOA:

  • Strict: Any change is considered unsafe and must cause a new version. Non-breaking changes cause a new minor version. Breaking changes will require a new major version. Both backwards and forwards compatibility are intentionally disregarded.
  • Flexible:  Non-breaking changes will only result in a new point version. Breaking changes will of course require a new major version. This strategy uses backwards compatible contracts, but not forwards compatible contracts.
  • Loose: Use both backwards and forwards compatible contracts. Breaking changes will of course require a new major version.

“Strict” is a very common approach to handling schema versioning. It is a safe approach that will give you no surprises when changing contracts. In fact, the schema artifact versioning table shown above is based on the “Strict” strategy. However, it will lead to an explosion of contract versions as your services evolve, and this will hurt discoverability and especially governance, not to mention reuse. In addition, operations will need to keep your multitude of service versions up and running. Changes to schemas that are used in aggregate schemas (schema compositions) will ripple through all involved aggregates and cause a domino effect of new schema versions. And of course, when a schema used in a service gets a new version, the service must also get a new version. Composite services that involve these services will then also be affected and must also get a new version. Thus, the ripple effect is much bigger than you think.

This versioning domino effect soon causes a small change to ripple through all versionable artifacts in your system. And it doesn’t stop there; in the end it will affect your consumers.

Composability will suffer when you have poor service discoverability and a proliferation of service versions. Which services should an unfortunate consumer use and which services will work together as composite services? Having standardized service contracts including a common information model for your domain might not be enough when there are thousands of versions of the standardized services and schemas.

“Flexible” tries to alleviate the version explosion effect of “Strict” by treating all non-breaking changes as safe and backwards compatible. As backwards compatible contract changes by design continue to support consumers designed to work with the old version of the contract, a new version is not needed, i.e. it is just a point version. Services can easily be composed together, as all the contracts within a major version are backwards compatible and only one minor service version, the latest, needs to be considered for every major version. Add “Loose” to get forwards compatible contracts, and all your compatibility & composability issues are history.

In theory this seems to be the perfect solution: only breaking changes require a new version and hence ripple through all versionable artifacts. In practice, “safe” changes can have side-effects, both functional and non-functional. As Nicolai M. Josuttis shows in the book SOA in Practice, even adding a new optional XML element can have non-functional side-effects such as increasing the response time of a service and thereby breaking its SLA. It would be safer to provide a new service version with the new schema: if there is a problem, only the upgraded consumers that required the change will be affected. Keep in mind, though, that in our proposed recommendations compatibility is stated, not implied. If the SLA were changed by a single element, this would mandate the definition of a new major version (and XML namespace), even though the change is perfectly compatible from an XML extensibility perspective.

In our experience, a certain combination of these strategies works better:

  • Flexible/Strict: Use “Flexible” for all safe schema changes, while “Strict” must be used for any unsafe modification to schemas. Breaking changes will require a new major version. This approach supports backwards compatibility.

The “Flexible/Strict” strategy combines the best of the three original strategies. It grades changes into safe and unsafe even if they are theoretically backwards compatible (i.e. non-breaking). Safe changes cause point versions, while unsafe changes cause at least a minor version. There is no absolute classification scheme for which changes are safe and which are unsafe, but we suggest that additions to schemas be considered safe while modifications to existing schema components be considered unsafe. Always judge whether even a “safe” change may cause negative side-effects, and remember to consider non-functional aspects.

Strategy         Change                Major  Minor  Point  FW
Strict           Breaking              X
Strict           Non-Breaking                 X
Flexible         Breaking              X
Flexible         Non-Breaking                        X
Loose            Breaking              X                    X
Loose            Non-Breaking                        X      X
Flexible/Strict  Breaking              X                    +
Flexible/Strict  Non-Breaking, Safe                  X      +
Flexible/Strict  Non-Breaking, Unsafe         X             +

Table 3. Compatible changes classified as safe or unsafe
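Table 3 can also be read as a simple lookup from strategy and kind of change to the version level that must be raised. The sketch below encodes our reading of the table; the dictionary and function names are hypothetical, and the FW column (forwards compatibility) is left out because it does not influence the version level.

```python
# Our reading of Table 3: (strategy, change) -> version level to raise.
VERSION_LEVEL = {
    ("strict", "breaking"): "major",
    ("strict", "non-breaking"): "minor",
    ("flexible", "breaking"): "major",
    ("flexible", "non-breaking"): "point",
    ("loose", "breaking"): "major",
    ("loose", "non-breaking"): "point",
    ("flexible/strict", "breaking"): "major",
    ("flexible/strict", "non-breaking, safe"): "point",
    ("flexible/strict", "non-breaking, unsafe"): "minor",
}


def version_level(strategy: str, change: str) -> str:
    """Return the version level a change requires under a given strategy."""
    try:
        return VERSION_LEVEL[(strategy.lower(), change.lower())]
    except KeyError:
        raise ValueError(f"no rule for {strategy!r} / {change!r}") from None
```

Whether a non-breaking change counts as safe or unsafe remains a judgment made by the service owner; the lookup only records the consequence of that judgment.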

Using “Flexible/Strict” will impact the schema artifact versioning table as all changes that in the “Strict” sense must be a new minor version now can become just a point version for safe changes. Unsafe non-breaking changes must still be a new minor version. Note that even if you judge a code change to be safe, we still recommend this to be a new minor version.

Schemas are validated and routed based on their major and minor versions, thus aiming for point version compatibility will also reduce the need for intricate versioning mechanisms in your services.

We recommend using the “Flexible/Strict” strategy for your published services, and also use forwards compatibility in the form of planned extensibility. Avoid just throwing in schema wildcards everywhere as this will lead to vague contracts, countering discoverability. John Evdemon advises that “schemas should be designed for extensibility, not to avoid versioning”. Judicious use of unambiguous wildcards can help minimize service versioning. We strongly recommend following these guidelines.

Use a combination of compatible contracts and multiple active service versions to ensure that your system is flexible enough to accommodate the inevitable changes that will happen over time. Keep things simple. Whatever you do, do not try to implement some implicit automagical handling of versioning inside your services; instead, expose your services at abstract endpoints, apply intelligent routing to achieve service virtualization, and apply schema duck-typing outside your services.

You should use the flexible strategy during development of the services due to the agility needed in that phase. Wait until the services have been published (are in production) to apply strict versioning aspects, otherwise you will just end up with a very frustrated bunch of consumer and provider developers.

Note that platforms differ in how they support backwards compatibility. Some require the use of schema wildcards, while others, like WCF, have implicit support for forwards compatibility. Always test that the platforms involved work with your backwards compatible schemas. Don’t rely on this just working; interoperability is important for composability.

Data Model and Message Type DSLs

One of the core problems of distributed computing technologies is the combined handling of both information and business logic. Some approaches are good at managing distributed information (REST / HTTP) and some technologies are good at invoking business logic (Web Services). Technologies that have tried to address both using a combination of remoting technologies and naming & identity services have generally failed at delivering an environment where both information access and business logic invocation coexist harmoniously. Most often, a pattern, such as the Data Transfer Object pattern, has become the prevalent mode of interaction and information representations are simply conveyed back and forth between endpoints representing services, i.e. business logic.

One of the fundamental reasons for the lack of solution for this difficult problem is the fact that Enterprise Data is “relational” and utilizes most often bidirectional associations. By contrast, the Web is built on a “navigational” data model and unidirectional links. In both cases (navigational and relational), there is always the need to fetch data in a denormalized way via the same endpoint. This means that when fetching a purchase order, one might also return some customer specific information (name, address, telephone…) as well as shipping information. Whether you use a RESTful approach or a Web Services approach, the problem is exactly the same. REST is of course slightly better equipped to provide a normalized solution to this problem since a link to the customer can be embedded within a purchase order representation, but this is generally not practical as representations often need to embed related data, to enhance user experience, limit navigations and network roundtrips.

The goal of this paper is not to solve this distributed information integration problem. We are going to assume that people can only deal with it behind endpoints, in both the REST and Web Services worlds (this is where we are today, until more progress is made in distributed computing technologies). In this environment, the problem people have to deal with is how to manage denormalized message types, especially in the context of versioning, i.e. when the enterprise data model needs a revision (Fig 5.).

Fig 5.  How do we keep the enterprise data model and message types synchronized?

Up until this point, the industry has kept Enterprise Data Models, XML Schemas and Message types relatively separate. At most, people defined an ontology with links to both.

In this article we argue that (Fig 6.)

  • XML Schema should not be used for creating and managing enterprise data models.
  • Enterprise Data Models should be created and managed based on a DSL (EDM-DSL).
  • Message types should be created and managed based on a DSL, using the Enterprise Data Model elements as building blocks (MT-DSL).
  • XML Schemas should be generated from the Message Type DSL (itself referencing elements of the Enterprise Data Model).

Fig. 6. Enterprise Data Model and Message Type DSLs

Let’s explore these points one by one. XML Schema should not be used for creating and managing Enterprise Data Model because XML Schema was never designed for that. XML Schema cannot describe efficiently the relational nature of enterprise data models. XML Schema is hierarchical in nature and cannot model well bidirectional relationships between information entities. XML Schema is a technology that is well suited to describe and validate the structure of self-standing “documents” be it message types, web pages, office documents…

Enterprise Data Models should be created and managed based on a DSL because no current modeling technology, be it UML or Entity-Relationship diagrams (ERD), has the appropriate semantics to describe Enterprise Data Models. The core semantic that is missing, and that makes every existing technology a non-starter, is paradoxically the ability to describe a hierarchical structure of related elements. In other words, be it in UML or ERD, we do not have the semantic to define the boundary of a purchase order or a customer. We can define classes or entities, but in reality a purchase order or a customer is effectively “composed” of several classes. We need to be able to define a boundary around these objects, and no “standard” modeling language supports this semantic because they are themselves too close to the physical implementation model (Object Orientation in the case of UML and RDBMS in the case of ERD). In the case of UML, we would have to define a UML profile to extend it with these semantics. We recommend, however, creating a dedicated Data Model DSL instead of using a UML profile, because transformations are a lot easier when uncluttered by the UML metamodel.
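
The missing “boundary” semantic can be illustrated with a minimal sketch. It assumes a tiny in-memory model rather than a real DSL: the names Entity, Association, BusinessObject and the purchase order entities are all hypothetical and chosen only for illustration. The point is that a business object names the set of entities it is composed of, so that associations leaving that boundary can be identified.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """A class or entity, as UML or ERD would describe it (hypothetical)."""
    name: str
    attributes: tuple[str, ...] = ()


@dataclass(frozen=True)
class Association:
    """A bidirectional relationship between two entities, by name."""
    source: str
    target: str


@dataclass
class BusinessObject:
    """The missing semantic: a named boundary around several entities."""
    name: str
    root: str
    members: set[str] = field(default_factory=set)

    def contains(self, entity_name: str) -> bool:
        return entity_name == self.root or entity_name in self.members


def crossing_associations(obj, associations):
    """Return the associations that leave the boundary of a business object."""
    return [a for a in associations
            if obj.contains(a.source) != obj.contains(a.target)]


# Hypothetical model: a purchase order is composed of several entities.
associations = [
    Association("PurchaseOrder", "OrderLine"),
    Association("PurchaseOrder", "ShippingInfo"),
    Association("PurchaseOrder", "Customer"),
    Association("OrderLine", "Product"),
]
purchase_order = BusinessObject(
    name="PurchaseOrder",
    root="PurchaseOrder",
    members={"OrderLine", "ShippingInfo"},
)

external = crossing_associations(purchase_order, associations)
print([(a.source, a.target) for a in external])
```

In this sketch the associations to Customer and Product cross the purchase order boundary, which is exactly the information a message type definition needs when deciding what to embed and what to reference.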

Message types should be defined based on a dedicated message type DSL which references elements of the Enterprise Data Model because, in essence, this is really how Message Types are constructed (or at least should be constructed). When you create a service interface, be it RESTful or Web Services based, you really want to create a message type (or resource representation) that reflects the enterprise data model both semantically and structurally, instead of reflecting the particularities of a given back end system. This is in line with loose coupling practices, which recommend minimizing the “contract-to-implementation” coupling. An approach based on the semantics of an Enterprise Data Model is expected to reduce the need for Governance (or at least to reuse Data Governance efforts) and to create service interfaces or resource representations that are more likely to be reusable and composable.

Finally, flat XML Schemas should be generated from these Message Type definitions (themselves based on the Enterprise Data Model). Flat schemas improve interoperability between Web Service stacks, as some stacks have difficulty dealing with complex nested schema files. XML Schema is a great technology to validate the structure and (some of) the content of “documents”. It should remain the primary choice for describing the structure and validating the content of incoming messages.
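
The generation step can be sketched as follows. This is only an illustration under stated assumptions, not the tooling the article has in mind: the EDM and MESSAGE_TYPE dictionaries stand in for real DSL definitions, and the entity names, field names and the `urn:example:orders:v1` namespace are hypothetical. The generator emits one self-contained schema with a named complex type and no import or include.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)

# Hypothetical enterprise data model: entity name -> {attribute: XSD type}.
EDM = {
    "PurchaseOrder": {"orderId": "xs:string", "orderDate": "xs:dateTime"},
    "Customer": {"name": "xs:string", "telephone": "xs:string"},
}

# Hypothetical message type: a denormalized projection of EDM attributes.
MESSAGE_TYPE = {
    "name": "GetPurchaseOrderResponse",
    "fields": [("PurchaseOrder", "orderId"),
               ("PurchaseOrder", "orderDate"),
               ("Customer", "name")],
}


def generate_flat_schema(message_type, edm, target_namespace):
    """Build a single self-contained schema with no import or include."""
    schema = ET.Element(f"{{{XS}}}schema", {
        "xmlns:tns": target_namespace,
        "targetNamespace": target_namespace,
        "elementFormDefault": "qualified",
    })
    # One named complex type (no anonymous types), referenced by one element.
    type_name = message_type["name"] + "Type"
    ctype = ET.SubElement(schema, f"{{{XS}}}complexType", {"name": type_name})
    sequence = ET.SubElement(ctype, f"{{{XS}}}sequence")
    for entity, attribute in message_type["fields"]:
        xsd_type = edm[entity][attribute]  # KeyError if the EDM lacks it
        ET.SubElement(sequence, f"{{{XS}}}element",
                      {"name": attribute, "type": xsd_type})
    ET.SubElement(schema, f"{{{XS}}}element",
                  {"name": message_type["name"], "type": "tns:" + type_name})
    return schema


schema = generate_flat_schema(MESSAGE_TYPE, EDM, "urn:example:orders:v1")
xsd_text = ET.tostring(schema, encoding="unicode")
print(xsd_text)
```

Because every field is looked up in the data model, a message type cannot silently drift away from the Enterprise Data Model: a field that no longer exists there fails at generation time.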

Active Service Versions

It is important to implement service lifecycle management policies and procedures to help with governing service versions. We recommend creating an “Active Service Versions” policy to keep the number of service versions that you have to govern to a minimum. Alas, our experience shows that in practice this will be hard to adhere to, so implement the policy as a guideline rather than a law.

Such a policy typically states that there should be a maximum of three active versions of a service available for consumers, plus a hidden active minor version of the latest major. The hidden minor version gives current consumers a transition period for reverting to a compatible version, just in case the compatibility statement made for the latest minor version was erroneous. In that case, the consumer will manually point to the previous minor version endpoint.

Fig. 7. Service version classification

You will typically end up with some services that require five or six active major versions, especially popular services that are used by multiple consumers. There will always be some consumers that are laggards in upgrading to newer service versions, and you might not be able to cut them off. However, the approach that we are recommending ensures that for every major version available, we only need to expose its latest minor version.

Ensure that new consumers always by default discover the published version of your services (i.e. the latest major/minor version). The published version is what developers will get using the classic http://url?WSDL “discovery” of the service. In addition, we recommend that three major versions of a service can be discovered in your service registry. These must be the latest minor version of each.
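
The policy described above can be sketched in a few lines. This is a simplified reading of it, assuming that versions are plain (major, minor) tuples and that at least one version is deployed; the function name and the sample versions are hypothetical.

```python
from collections import defaultdict

MAX_ACTIVE_MAJORS = 3  # policy value taken from the text


def classify_versions(deployed):
    """Classify (major, minor) tuples according to the policy sketch.

    Returns the published version, the discoverable versions, the hidden
    fallback version (or None) and the deprecated versions.
    Assumes that at least one version is deployed.
    """
    minors_by_major = defaultdict(list)
    for major, minor in deployed:
        minors_by_major[major].append(minor)

    majors = sorted(minors_by_major, reverse=True)
    latest = {m: max(minors_by_major[m]) for m in majors}

    # The latest minor of each of the newest majors is discoverable.
    discoverable = [(m, latest[m]) for m in majors[:MAX_ACTIVE_MAJORS]]
    published = discoverable[0]

    # Hidden fallback: the previous minor of the latest major, if any.
    newest = majors[0]
    older_minors = [n for n in minors_by_major[newest] if n < latest[newest]]
    hidden = (newest, max(older_minors)) if older_minors else None

    active = set(discoverable) | ({hidden} if hidden else set())
    deprecated = sorted(set(deployed) - active)
    return {"published": published, "discoverable": discoverable,
            "hidden": hidden, "deprecated": deprecated}


deployed = [(1, 0), (2, 0), (2, 1), (3, 0), (4, 0), (4, 1), (4, 2)]
result = classify_versions(deployed)
print(result)
```

With these sample versions, 4.2 is published, 4.2, 3.0 and 2.1 are discoverable, 4.1 is the hidden fallback, and 1.0, 2.0 and 4.0 are candidates for deprecation.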

Finally, you must monitor the usage of the services to know which are used and by whom. The service lifecycle stewards must notify all consumers of deprecated versions and gently force them to upgrade to an active service version. When no one (important) is using a deprecated version, you can decommission it. This ends the lifecycle management of that service version. You can easily imagine how much harder the steward’s job would be if he or she had to do the same thing for both the major and multiple minor versions as well.
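
As a rough sketch of this monitoring step, assume a usage log of (consumer, version) pairs collected from the endpoints and a set of consumers that cannot be cut off. The log format, the consumer names and the notion of an “important” consumer as a simple set are all assumptions made for illustration.

```python
from collections import defaultdict


def consumers_by_version(usage_log):
    """Group consumer names by the (major, minor) version they invoked."""
    consumers = defaultdict(set)
    for consumer, version in usage_log:
        consumers[version].add(consumer)
    return consumers


def review_deprecated(deprecated, usage_log, important_consumers):
    """Decide, per deprecated version, whom to notify and whether to retire."""
    consumers = consumers_by_version(usage_log)
    review = {}
    for version in deprecated:
        users = consumers.get(version, set())
        review[version] = {
            "notify": sorted(users),
            # Retire only when no important consumer still uses the version.
            "can_decommission": not (users & important_consumers),
        }
    return review


# Hypothetical usage data.
usage_log = [
    ("billing", (1, 0)),
    ("crm", (2, 1)),
    ("intranet-widget", (2, 0)),
]
review = review_deprecated(
    deprecated=[(1, 0), (2, 0)],
    usage_log=usage_log,
    important_consumers={"billing", "crm"},
)
print(review)
```

Here version 1.0 must stay because an important consumer still depends on it, while 2.0 can be decommissioned once its remaining consumer has been notified.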

The active service versions should be accessible through a single virtual endpoint that the consumers use to invoke the service. This will lessen the impact that versioning the service will have on its consumers, as the virtual endpoint stays the same and handles the routing to the correct active service version. Compatibility reduces the need for virtualization as the same endpoint can service several generations of compatible consumers. Services are routed based on their major and minor versions, thus aiming for point version compatibility will also reduce the need for complex virtual endpoint mechanisms.
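
The routing decision of such a virtual endpoint can be sketched as below. It assumes that the consumer states the (major, minor) version it was built against and that minor versions within a major are compatible; how the version is conveyed in the message is left open, and the function name is hypothetical.

```python
def route(requested, active_versions):
    """Pick the active version that should serve a consumer request.

    requested: the (major, minor) version the consumer was built against.
    active_versions: iterable of (major, minor) versions currently deployed.
    The consumer is served by the latest active minor of its own major
    version, relying on minor versions being compatible.
    """
    major, minor = requested
    candidates = [v for v in active_versions
                  if v[0] == major and v[1] >= minor]
    if not candidates:
        raise LookupError(f"no active version is compatible with {requested}")
    return max(candidates)


active = [(4, 2), (4, 1), (3, 0), (2, 1)]
print(route((2, 0), active))
print(route((4, 1), active))
```

A consumer built against 2.0 is routed to 2.1, and one built against 4.1 is routed to 4.2. A request for a major version that is no longer active raises an error, which is where the lifecycle steward's notifications come in.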

Service Virtualization is achieved by combining several ESB patterns such as abstract endpoints, intelligent routing and schema transformations. Using service virtualization isolates the consumers from the providers and it processes messages “in-flight” to route and mediate between the providers and versions that make up a composite service that a consumer invokes or triggers through an event.

Virtualization is not only for composite services; in fact, all your published services at any classification level would benefit from it from a governance perspective.

Interoperability

The economy is getting more and more globalized and outsourcing seems to be the standard these days. The business processes of a company more often than not involve partners and suppliers mashed together into what is called the extended enterprise.

The constituents of an extended enterprise will have heterogeneous systems, just as the different business units of a company most likely will have heterogeneous systems (Fig. 8). There is no end to the diversity of legacy systems in use in companies today. A recent Gartner study shows that the diversity of platforms used to deliver SOA is growing; even mainframe COBOL has had a surge in the last year.

Composability across these heterogeneous service providers in the extended enterprise will require interoperability. This affects how you design your services and schemas, including which of the WS* standards you choose to use in your services. Our interoperability experience is summarized in the table below.

Artifact | Interoperability
SOAP | Adhere to WS-I Basic Profile 1.1
SOAP 1.2 | Most legacy platforms and tools require SOAP 1.1
WSDL style | Prefer “Document/Literal Wrapped” (D/L-W)
WSDL design | Avoid wsdl:import and xsd:import as many tools require a single flat WSDL. An example is Flex.
Messages | Use message exchange patterns (MEP), e.g. as defined in WSDL 2.0
Message design | Single part element for both request and response
Schema style | Prefer “Venetian Blind” (complex types)
Schema style | Do not use anonymous types (“Russian Doll”)
Schema design | Support XSD extensibility, use a marker element before the schema wildcard; always test support for schema wildcards across involved platforms
Schema design | D/L-W schema components: only elements are allowed, use of attributes is prohibited
Schema design | Use of attributes for data or to annotate data elements might not be supported by all platforms
Schema arrays | Use wrapped arrays/collections
XSD types | Not all XSD data types are supported by all platforms (e.g. xs:date, xs:positiveInteger)
XSD nillable | Some nillable schema components might not be supported by a platform, while other components might be required to be nillable
WS* standards | The support varies wildly between platforms; always test and verify the chosen standard for interoperability
WS-Security | Make sure to agree on version, token type, message or transport security, order and level of encryption and signing, and establishment of secure conversations (security sessions)
WS-Policy, WS-SecurityPolicy | Some platforms and tools do not support these standards, thus policy metadata has to be communicated out-of-band. An example is Java Spring-WS.

Table 4. Contract design guidelines
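
Some of the schema guidelines in Table 4 lend themselves to automated checks. The sketch below is a deliberately small and heuristic linter that assumes the schema is available as a single XSD string; the function name, the messages and the sample schema are hypothetical, and the marker element check only verifies that a wildcard directly follows a mandatory element.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"


def lint_schema(xsd_text):
    """Return a list of guideline violations found in an XSD document."""
    root = ET.fromstring(xsd_text)
    problems = []
    if root.find(f"{XS}import") is not None:
        problems.append("uses xsd:import; many tools require a flat schema")
    # An anonymous type is a complexType without a name attribute.
    for ctype in root.iter(f"{XS}complexType"):
        if ctype.get("name") is None:
            problems.append("anonymous complex type (Russian Doll style)")
    for attr in root.iter(f"{XS}attribute"):
        problems.append(f"attribute '{attr.get('name')}' may not be portable")
    # Heuristic: a wildcard should directly follow a mandatory marker element.
    for seq in root.iter(f"{XS}sequence"):
        children = list(seq)
        for i, child in enumerate(children):
            if child.tag != f"{XS}any":
                continue
            prev = children[i - 1] if i > 0 else None
            has_marker = (prev is not None
                          and prev.tag == f"{XS}element"
                          and prev.get("minOccurs", "1") != "0")
            if not has_marker:
                problems.append("wildcard without a preceding marker element")
    return problems


# Hypothetical schema that breaks four of the guidelines.
SAMPLE = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:import namespace="urn:example:common"/>
  <xs:element name="Order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="id" type="xs:string"/>
        <xs:element name="note" type="xs:string" minOccurs="0"/>
        <xs:any minOccurs="0" maxOccurs="unbounded" processContents="lax"/>
      </xs:sequence>
      <xs:attribute name="currency" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

problems = lint_schema(SAMPLE)
for problem in problems:
    print(problem)
```

Checks like these do not replace testing across the involved platforms, which the table insists on, but they catch the cheapest violations before a schema is published.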

 

Fig. 8. Growing diversity in SOA platforms

The figure “Trends in Use of Development Languages are Related to SOA Adoption” is from the Gartner report “2008 SOA User Survey: Adoption Trends and Characteristics” by Daniel Scholler.

Conclusion

In this article we have defined a versioning strategy which focuses on reuse by enabling services to evolve to meet new consumer requirements without breaking existing consumers of the service. This approach adds a new dimension to the reuse of services. People traditionally think of reuse as a new consumer reusing a service designed for an existing consumer. In a way, our approach introduces a “forward” reuse strategy instead, as the “new version” of a service is reused by the older consumers.

From our experience, compatible, versioned data models, messages and services have not been a primary concern of SOA initiatives. In addition, of those that defined a versioning strategy, very few have used XML and XML Schema extensibility. It is our strong belief that a compatibility-based versioning strategy can increase service discoverability, composability and true reuse. It can also reduce, albeit not eliminate, the need for service governance. Overall, we expect that the cost of construction, operation and maintenance of a service will be greatly reduced by such a versioning strategy. It is time to move beyond primitive reuse to reap the benefits of your service inventory.

[1] Harmut Wilms, InnoQ, Private Communication

Community comments

Schema and WSDL first by David Karr

Good article. I just wanted to emphasize one point along with some comments.

As experienced SOA professionals, you make an underlying assumption about proper SOA development that some people may not clearly see. The assumption is that the schema and WSDL are built "from scratch", and not generated from the business logic code (ignoring for now your thoughts about schema generation from a DSL). This may be obvious to you, but it needs to be emphasized every time you talk about best practices for SOA development, because some people still don't get it.

This notion is related to your thoughts about the mismatch between the EDM and message types. If you generate schema and WSDL from business logic, you end up shoehorning the EDM into message types, which doesn't work.

Perhaps you should elaborate a little more on your thoughts about using a DSL. Nowhere in here do you actually spell out "domain-specific language", or what that means. I don't believe that TLA has yet earned the right of "self-description" :) .

(Note: Figure 2 has "Cosumer" (have I seen this diagram somewhere else?).)

Re: Schema and WSDL first by Christian Schneider

Some time ago I wrote a how-to for CXF on using Java as an internal DSL for generating WSDLs. Of course Java is not the perfect DSL for service definition, but compared to writing WSDL by hand it is quite good. I also describe why my approach is contract-first although I start from Java.

Defining Contract first webservices with wsdl generation from java
cwiki.apache.org/CXF20DOC/defining-contract-fir...

Re: Schema and WSDL first by Jean-Jacques Dubray

Dave:

Thanks for your comments. The article makes no assumption about how the contract is defined. We have tried to make as few assumptions as possible about the contract. Our goal was simply to point out that there are certain design guidelines (which could be added to a WSDL generated from code, for instance) that make a version of a service "compatible" with existing customers. To the best of my knowledge, the only constraints that this approach imposes on contract generation from code are that the XML Schema namespace be identical for all the minor versions of the same major version and that the SOAP action be named independently of the WSDL namespace.

Please remember that compatibility is "stated" not "guaranteed" and that compatibility is achieved because there is an agreement a priori on how extensions to the expected schemas or interface definitions will be processed. Many different strategies and tactics may be derived from this approach.

>> If you generate schema and WSDL from business logic, you end up shoehorning
>> the EDM into message types
As a matter of fact, it is recommended that the XML Schemas be entirely generated from some model, otherwise the management (and versioning) of the Schemas can quickly run out of control. You can decide what is the best source of metadata (I suggest a Domain Specific Language) for your EDM, but managing schemas and data models separately is, IMHO, not a good idea.

Re: Schema and WSDL first by Jean-Jacques Dubray

Yes, absolutely: if it works for your type of services, this is a perfectly valid approach and you can bake in some of the versioning recommendations expressed in this paper.

As pointed out in the article, the problem with object models (~UML) is how they relate to message types. There is a mismatch there that may force you to define classes that represent the lowest common denominator for message types.

Distributed Information Integration Problem as defined above by Yeu Wen Mak

What about the progress made in SDO specification - en.wikipedia.org/wiki/Service_Data_Objects?

Re: Distributed Information Integration Problem as defined above by Jean-Jacques Dubray

To be frank, I have lost touch with the latest development of SDO. It does solve important problems in an SOA (just like DataSets in the .Net world) but I have not seen many people using these technologies outside their respective application stacks (Java & .Net).

Ignore-Unknown with XML Schema? by Hermann Schmidt

Nice to see that someone really got his head around this substantial and widely ignored problem.

"Consumers of a message must accept and remove from processing any element that they do not recognize".

That is a perfectly reasonable requirement (called the "ignore unknown" strategy). However, how would you suggest implementing it? An XML Schema validating parser like Xerces cannot do it for me. How can a consumer detect unrecognized elements in the first place?

Re: Ignore-Unknown with XML Schema? by Kjell-Sverre Jerijærvi

Yes, your parser has to support it, like the .NET WCF XML de/serializers, which ignore elements added to the end of the contract, i.e. elements that they know nothing about. If you need to roundtrip the ignored data, it can be kept in the ExtensionDataObject provided by WCF. However, some of the WCF XML serializers do not support xs:any, which is why it is important to always test this feature across all involved platforms.

Re: Ignore-Unknown with XML Schema? by Humphrey Bogart

try using lax or skip settings in validation processing to ignore things you don't have knowledge of

Re: Ignore-Unknown with XML Schema? by Hermann Schmidt

Well, the processContents attribute is only defined for xs:any and refers to elements in namespaces other than the target namespace. xs:any does not solve the extension problem fully, as Dave Orchard pointed out in his article.

What if there are new (optional) elements in the target namespace?

Re: Ignore-Unknown with XML Schema? by Hermann Schmidt

I have no experience with .NET technology. As for Java, I did not come across any generic solution, yet.

And then there are the various XML data binding tools, which don't care about it. In my case, SeeBeyond eGate and SUN Java CAPS have no feature to kind of filter the incoming XML before parsing it. They will drop out with unmarshal errors, because both are strictly XML Schema based.
I need to filter it myself with some technology I don't have, yet.

Also, the following XML Schema construct killed them all:
<xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>

That's how Dave Orchard suggested an extension point for all namespaces other than the target namespace.
None of the tools I've mentioned above were able to generate valid binding code with this.

Have I already mentioned that I hate XML Schema? :-)

Re: Ignore-Unknown with XML Schema? by Jean-Jacques Dubray

The article recommends using an <eov> marker (because of the UPA rule) with a cardinality of exactly one. This is not pretty, but it could be pretty effective in helping to write the filtering library.

The filtering is important because extensions can create problems when elements are accessed relatively: //telephone could return one element in v1 and two elements in v2 (and break your code).

Another approach is to have the provider perform the filtering for you based on the request version. This is similar to the "accepts" concept in REST, where the resource provides the representation that can be consumed by the consumer.

The whole goal of the article is to emphasize the importance of "compatibility" when versioning services. Now, there are multiple ways to achieve compatibility, and I don't think we can pinpoint one that is much better than the others.

It seems that in your case a virtual appliance could be implemented to frontend your infrastructure and perform this work.

Do you hate XML Schema itself or the way people misuse XML Schema? At the end of the day, there are not many options to create an "XML Schema" technology. Either you say I don't need "extensibility" or you say "extensibility" solves quite complex problems for me. Do you prefer the alternative of managing 15 versions of the same service because 14 consumers never wanted to change their implementation to deal with a slightly varying schema, that could have been made otherwise compatible?

It is all a question of trade off and design. </eov>

Re: Ignore-Unknown with XML Schema? by Kjell-Sverre Jerijærvi

As Dave Orchard says, extensibility/wildcards and versioning are similar but not the same. That is why we are recommending a versioning strategy for evolving contracts (major versions) that utilizes compatibility when possible (minor versions) to alleviate some of the negative effects of versioning. If you get into a UPA issue, that could simply mean that you need a new major version. We are not implying that you should use wildcards to avoid versioning.

Re: Ignore-Unknown with XML Schema? by Hermann Schmidt

Non-breaking changes are by far the most common case and I really wouldn't like to issue a new version for each little element I add.

I'd rather like to achieve compatibility through an intelligent infrastructure (the virtual appliance that Jean-Jacques mentioned), which would make extending existing entities (Schema types) with new optional attributes as simple as adding a new nullable column to a database table. Not invented, yet. I am looking into it.

It is the same old issue with different technology. It was easy with databases. XML Schema is frustrating me. One reason why I hate it.
The other reason why I hate it is that the finest theoretical concepts are worth nothing when the tools are still not capable of all the features. How many years have we had XML Schema now? They just don't get it.

==> You need to know and test all the platforms before you design your Schema. Interoperability? Ridiculous. We need something like the WS-I for XML Schema! :-)

Re: Ignore-Unknown with XML Schema? by Kjell-Sverre Jerijærvi

>which would make extending existing entities (Schema types) with new optional attributes as simple as adding a new nullable column to a database table. Not invented, yet.
As I wrote earlier, this is built into .NET WCF data contract handling, lucky me :)

Re: Ignore-Unknown with XML Schema? by Jean-Jacques Dubray

>> You need to know and test all the platforms before you design your Schema. Interoperability? Ridiculous. We need something like the WS-I for XML Schema! :-)

Granted, this is definitely an issue. This is also why we recommended using a Message Type DSL: you can bake the interoperability knowledge into all your schemas, in addition to generating flat XML Schemas.

The DSL route is not that hard to achieve using tools like VS/DSL Tools or Eclipse/EMF. I am planning a follow up article on this topic. The huge value there in addition to better interop is the link with the Enterprise Data Model.

Great article by Johan den Haan

Good to see this widely ignored subject explained in such a detailed way. Great work!

Re: Ignore-Unknown with XML Schema? by Hermann Schmidt

I like the DSL route you are suggesting. However, this is so new that there are very few places to "steal" ideas from. Each shop will create its own flavour of a DSL.

Having wrestled with UML2 models and generators based on Eclipse Ecore, I know now that this is not the future of software development.

Anyway, I am looking forward to your follow-up article! I have already posted this article as a must read to colleagues.

Re: Schema and WSDL first by JM Beas

>> (Note: Figure 2 has "Cosumer" (have I seen this diagram somewhere else?).)
Maybe the diagram comes from an old article in an MSDN blog whose title is "Versioning Web Services": blogs.msdn.com/donsmith/pages/VersioningWebServ...

Both articles give lots of insight on this issue. One of Thomas Erl's latest books is also dedicated to versioning of web services in an SOA: www.soabooks.com/wsc/default.asp

Re: Schema and WSDL first by Kjell-Sverre Jerijærvi

The diagram most likely comes from Dave Orchard's explanation of backwards/forwards compatible schemas:
Example 2 in www.w3.org/2001/tag/doc/versioning
or
Figure 1 in www.xml.com/pub/a/2004/10/27/extend.html

Anyhow, note the quite confusing use of the term "consumer" for the schema receiver in Dave's articles. Mix this with talking about service compatibility rather than message/schema compatibility, and you'll have a very annoying discussion about forwards compatible "consumers" - don't be surprised if you think the other guy has got forwards backwards :)

kjellsj.blogspot.com/2008/12/contract-compatibi...

Who are Service lifecycle stewards by Rodolfo Dias

Good article, but I do not understand "service lifecycle stewards": is this a design pattern, software or a process?

Thanks in advance

Re: Who are Service lifecycle stewards by Jean-Jacques Dubray

Rodolfo:

Thanks for your comment. The definition of a Steward is "one who administers anything as the agent of another or others". A Service Lifecycle Steward is someone who makes sure that the service is managed properly along its lifecycle (identification, design, test, operations, update/versioning, retirement...). You don't necessarily have this type of role with traditional applications as they are usually managed by a team.

DSL tools by Bernd Hofner

Nice article! I would like to know more about DSLs for Enterprise Data Modelling and Message Type Definition. We are currently starting to introduce a (customizable) UML tool to use for an enterprise model and for the definition of (SOA) service interfaces and the message types used by these services. A tool extension is used to generate XSD/WSDL from the UML.

We already were faced with some of the problems you stated in your article (bidirectional navigability, boundary of class trees) that occur if you try to use UML classes from the "global" enterprise model as part of message parameters for a specific service definition.

Referencing the enterprise model generally doesn't work well. We are currently trying a "smart copy" approach that leaves a "trace" between the original enterprise class and the message class. After you did the copy, the message class is independent from the enterprise class and you can cut off undesired associations, change multiplicity constraints for the attributes and decide in which direction the used associations are navigable.

How to handle versioning has probably not really been thought through, yet. How to make sensible use of the trace between enterprise class and message class in the light of an independent evolution of enterprise model and service models?

Now if we wanted to use a DSL instead of UML, quite a few topics would pop up:
* Which language to use (roll your own)? Is there already something that fits this problem domain?
* Which tools to use for modelling and visualizing the model? I could imagine to build an enterprise model with a text editor in a DSL. But even for that, tool support for the DSL, like syntax highlighting, auto-completion and syntax-check would be highly desirable. And it wouldn't be easy to communicate a model in such a representation, since people are used to graphical representations and just got used to UML. Yes, you could build your own language and editor with Eclipse ECore and EMF, but to get from the generated "custom" editor to a full-featured tool that plays in the same category as full-grown UML tools probably requires quite a lot of investment. Where's the business case for this?

Re: DSL tools by Kjell-Sverre Jerijærvi

Yes, we're a long way from standardized tooling on this; even the standards bodies are still at a level where messages are just simple composite aggregates of data schemas. There is some thinking around how to solve this, like Consumer-Driven Contracts, which includes ideas for both projected service interfaces and projected schemas.
msdn.microsoft.com/en-us/library/bb286659.aspx

Tooling is available for Service Virtualization today e.g in run-time SOA governance products. Tooling for "entity virtualization" might be the next step.
www.codeplex.com/servicesengine

Re: DSL tools by Jean-Jacques Dubray

I am going to publish an article next month on this topic. I have personally given up on defining a UML profile and went to two DSLs because the UML transformation tools I used were just too hard to use, or I was too stupid to use them, whichever you prefer.

>> After you did the copy, the message class is independent from the enterprise class and you can cut
>> off undesired associations, change multiplicity constraints for the attributes and decide in which
>> direction the used associations are navigable.
I think that I found a simple and elegant solution to that problem without losing the link to the Enterprise Data Model.

The advantages of this approach are that:
- message types are flat, there is no need to create a complex import/include structure
- other mechanisms can be baked into the message type generation (from its DSL definition): versioning of course, but also identity management, business envelopes (a la OAGIS)... This really helps gain consistency and avoids requiring everyone to be an expert in XML Schema design and to remember all the policies defined for designing a message type schema.

I have used both Eclipse EMF (no graphics) and VS DSL Tools. I found VS DSL tools harder to use but more powerful.

As I mentioned, the problem I had with the UML (profiles) is that the UML metamodel is too complex, way too complex. It makes creating transformations nearly impossible (IMHO). I'd rather use DSLs. I am sure there are some people on the planet that can get it to work, but I can't with a limited amount of time.

Excellent Article! by Mark Maxey

I've done quite a bit of research, implementation, and policy creation on versioning. This is the best article on the topic I've read (and I've read a lot). Thank you!

Re: Schema and WSDL first by Dennis Djenfer

Kjell-Sverre,

I suggest that you stick with the same definition of backwards and forwards compatibility as Dave Orchard and others are using. You can't really talk about "Service Backwards Compatibility" or "Service Forwards Compatibility" for the reasons that you've already mentioned yourself. In your figure 2, you solve the exact same problem in both the forwards and backwards scenario. A service is either compatible or not with regard to older versions of the service specification.

If you want to make changes to both the incoming and the outgoing messages for a service, and if you want those changes to be compatible with older service consumers, then you need to fulfill two criteria:

1. The NEW technical specification (XML-schema, WS-policy, WSDL) must be backwards compatible.
2. The OLD technical specification (XML-schema, WS-policy, WSDL) must be forwards compatible.

Re: Schema and WSDL first by Kjell-Sverre Jerijærvi

Yes, you are right, it is easier to talk about just the message sender and receiver - and my definition is the same as Dave Orchard's. And your two criteria are the same as what I refer to as 'ignore unknown' and 'ignore missing'. Talking about the provider or consumer as the "message consumer" is what might confuse most people, and Dave changed this wording in his latest article to "message processor".

Dave Orchard defines backwards and forwards for schemas, not for service providers - commonly referred to as services. If you look at it only from the service provider side, and ignore how the service consumer handles (response) messages sent by the provider, 'service compatibility' can be defined by the handling of incoming request messages - hence the terms backwards and forwards are the same as Dave's, only for "services".

If you think of the backwards "backwards" in the article as such, several "forwards" were toggled during the publishing process - and I've notified JJD about this lapsus. E.g. "flexible/strict" should of course support forwards compatibility.

I think we need to define both schema compatibility and service provider compatibility - and perhaps consumer compatibility. Let us know what you think.

Re: Schema and WSDL first by Kjell-Sverre Jerijærvi

I've written a longer explanation on service compatibility vs schema compatibility. Note the difference between schema design and schema validation: using forwards schemas is not required at run-time to achieve compatibility. Distributed systems are more than just SOAP, so the topics range from bilateral contracts to service virtualization, REST and messaging in general:
kjellsj.blogspot.com/2009/02/service-compatibil...
