Staying Safe and Sound Thanks to MDSD
When can a software product be considered truly complete? In most cases, a product's life ends with discontinuation or replacement. Until this final state, nearly every software product undergoes an evolutionary lifetime. Large and long-lasting complex enterprise systems have a particular tendency to become unmaintainable and inflexible over time. This results in development pace stagnation and an increase in reaction time to customer demands.
This article explains how common MDSD approaches can be leveraged to counteract these flaws. It starts with a general introduction of our opponents such as backward incompatibility and upgrade problems and explains why they are not to be lightly dismissed. It then exposes where such nonfunctional aspects are hidden in today's architectures.
Three examples illustrate how MDSD techniques help us bring those underestimated life cycle aspects under control without losing flexibility. The examples are taken from real-world agile projects in the eHealth industry and explain how the identified best practices can be applied in any other context as well.
The lessons learned are summarized as rules-of-thumb and the article closes with references to useful frameworks and tools that form the foundations on which the presented solutions are built.
In software engineering, model-driven software development (MDSD) has proven over the past few years to be more than a one-hit wonder. Today, many of the promises of MDSD have come true, and the number of success stories continues to grow.
This article illustrates how model-driven approaches can be leveraged to tackle even advanced aspects of today's software systems. Keeping MDSD best practices in mind, we focus on software life cycle aspects as typical non-functional requirements.
Software Life Cycle
Nearly every modern software system faces life cycle problems during its lifetime. The moment a product is shipped or deployed, you have to answer questions regarding backward compatibility and possible migration strategies. Often, compatibility concerns are underestimated or completely ignored during development. This attitude takes its revenge later, when a lot of resources have to be spent on making a product compatible after the fact. Taking compatibility seriously, on the other hand, may have a direct impact on the evolution of a software product, as we will see in the following introductory example.
The Case of java.lang.Cloneable
Java's java.lang.Cloneable interface demonstrates how backward compatibility limits the evolution of software. The interface, introduced in JDK 1.0, is just a marker interface and does not declare any methods at all. Types that want to support cloning need to implement this interface and, in addition, supply a proper implementation of java.lang.Object.clone(). Having the clone() method defined inside java.lang.Cloneable would seem like a more reasonable and natural API. This obvious interface change never happened because it would have broken backward compatibility: introducing the new interface method would have led to compile errors for existing classes out in the world that implement the interface without providing a public clone() method. For the complete history of this ancient bug, see the Sun bug report "Cloneable doesn't define .clone" listed under Links and Literature. For a more detailed discussion of java.lang.Cloneable, please refer to Bloch's Effective Java, item 11.
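To make the constraint concrete, the following minimal sketch shows what a class has to do today in order to support cloning. The class name Patient and its field are hypothetical and only serve as illustration; Cloneable and Object.clone() are the JDK types discussed above.

```java
// Hypothetical example class; only Cloneable and Object.clone() are JDK API.
final class Patient implements Cloneable {
    private final String name;

    Patient(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }

    // Cloneable declares no methods, so the class has to override the
    // protected Object.clone() and widen its visibility to public.
    @Override
    public Patient clone() {
        try {
            return (Patient) super.clone();
        } catch (CloneNotSupportedException e) {
            // Cannot happen here because Patient implements Cloneable.
            throw new AssertionError(e);
        }
    }
}
```

A class that implements Cloneable but leaves clone() protected compiles fine today. If the interface ever declared a public clone() method, such a class would stop compiling, which is exactly the kind of break the JDK maintainers avoided.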
Before analyzing existing life cycle issues in our systems in detail, it is important to first come up with a common vocabulary.
A system is considered backward compatible if:
- it can process interfaces from earlier generation(s) of the system
- it can process data from earlier generation(s) of the system
- service consumers are unaffected by version changes
Modifications of a system can make it backward incompatible. Through the use of a dedicated compatibility layer, the backward compatibility of a system can be recovered. Such a layer is responsible for providing views of a system interface as it looked in previous versions. Under these circumstances, well thought-out interface design becomes even more important.
The distinction between binary and source compatibility is irrelevant for the rest of this article. The Java Language Specification (chapter 13, see Links and Literature) describes the difference between the two for the Java language.
An update is an action performed to improve an existing software installation. An update does not extend the feature set of an application in any way. Its objective is to provide additional robustness (e.g. by closing security holes). An update is unproblematic with regard to backward compatibility as long as both interfaces and data representation stay intact. The OSGi versioning scheme defines the version pattern major.minor.micro. An update is indicated by an increase of the third segment only, the micro segment.
We regard a new version of, or an addition to, an already installed software product as an upgrade. Compared to an update, an upgrade provides a different feature set. Depending on the degree of deviation between the old and the new version, compatibility issues can arise. Taking OSGi versioning again, an increase of the minor version number indicates a backward compatible upgrade. An increase of the major version number identifies a backward incompatible upgrade.
Migration is the process of moving from one operating environment to another. For software systems, this corresponds to moving to a higher version of the same software (leaving aside that moving to a comparable competing product is a possible migration scenario as well). If the newer version of a product claims to be backward compatible with a previous version, a migration does not require any additional action from consumers when switching to the new version. Otherwise, a migration causes additional effort for adapting to the new environment. Hence, changes which lead to backward incompatibility affect all clients of a product. The more customers use your product, the more painful a backward incompatible change becomes for all participants. With an increasing number of dependent customers, the overall cost of a backward incompatible change and the subsequent migrations is in most cases not justifiable. Backward compatibility considerations therefore constrain the evolution of a software system significantly.
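The versioning rules above can be expressed as a small classification routine. The following sketch is illustrative only: the class VersionChange and its enum are hypothetical names and not part of any OSGi API, and the code assumes plain numeric major.minor.micro strings where the second version is the newer one.

```java
// Hypothetical helper; it mirrors the major.minor.micro scheme described above
// and is not part of any OSGi API.
final class VersionChange {
    enum Kind { NONE, UPDATE, COMPATIBLE_UPGRADE, INCOMPATIBLE_UPGRADE }

    // Parses "major.minor.micro" into three integers.
    static int[] parse(String version) {
        String[] parts = version.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected major.minor.micro: " + version);
        }
        return new int[] {
            Integer.parseInt(parts[0]),
            Integer.parseInt(parts[1]),
            Integer.parseInt(parts[2])
        };
    }

    // Classifies the step between two versions by the most significant
    // segment that differs.
    static Kind classify(String from, String to) {
        int[] a = parse(from);
        int[] b = parse(to);
        if (a[0] != b[0]) return Kind.INCOMPATIBLE_UPGRADE;
        if (a[1] != b[1]) return Kind.COMPATIBLE_UPGRADE;
        if (a[2] != b[2]) return Kind.UPDATE;
        return Kind.NONE;
    }
}
```

For example, classify("1.2.3", "1.2.4") yields UPDATE, while classify("1.2.3", "2.0.0") yields INCOMPATIBLE_UPGRADE.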
MDSD, the Knight in Shining Armour
If your product already relies on model-driven approaches, a number of possibilities are available to mitigate the conflicts mentioned above. The objective is to keep clients satisfied and unaffected by system changes while still being able to evolve a product without the feeling of having a millstone around your neck during development.
In each modern application we can identify potential troublemakers which inevitably bring us into backward incompatible situations. With MDSD in place, we have an instrument to counteract those forces. The following three examples demonstrate where life cycle aspects are typically concealed in today's software architectures and how model-driven approaches could be leveraged to keep compatibility under control.
Each exposed API states a contract between the provider and potential consumers of a programming interface. Replacing an existing API with a newer, but incompatible, version locks out clients who are using the older API. This is not only true for programming interfaces in common languages like Java but is relevant for all forms of exposed interfaces (e.g. web services).
API incompatibilities can be resolved through a dedicated backward compatibility layer. The software system itself only exposes the latest version of the API and does not support prior versions out of the box. The backward compatibility layer in turn is deployed in front of the target system and provides the API in all supported versions. For our discussion, it is irrelevant whether this layer is deployed in the form of a middleware ESB or inside the target application. The backward compatibility layer acts as an API façade and performs message transformations between versions. With this infrastructure in place, the rules for the message transformations have to be defined.
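A backward compatibility layer of this kind can be sketched in a few lines. The sketch below is an assumption-laden illustration, not the implementation used in the projects described here: the class CompatibilityLayer, the use of integer version numbers and the representation of messages as plain maps are all hypothetical simplifications.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.UnaryOperator;

// Hypothetical compatibility layer: messages are plain maps and API versions
// are consecutive integers.
final class CompatibilityLayer {
    private final int latestVersion;
    // Key n holds the transformation that lifts a message from version n to n + 1.
    private final Map<Integer, UnaryOperator<Map<String, Object>>> upgrades = new TreeMap<>();

    CompatibilityLayer(int latestVersion) {
        this.latestVersion = latestVersion;
    }

    void register(int fromVersion, UnaryOperator<Map<String, Object>> transformation) {
        upgrades.put(fromVersion, transformation);
    }

    // Lifts a message sent against an older API version to the latest version
    // by applying the registered transformations one after another.
    Map<String, Object> toLatest(int clientVersion, Map<String, Object> message) {
        // Work on a copy so the caller's message stays untouched.
        Map<String, Object> current = new HashMap<>(message);
        for (int v = clientVersion; v < latestVersion; v++) {
            UnaryOperator<Map<String, Object>> step = upgrades.get(v);
            if (step == null) {
                throw new IllegalStateException("no transformation from version " + v);
            }
            current = step.apply(current);
        }
        return current;
    }
}
```

The explicit failure for a missing step reflects the risk discussed in this section: a forgotten transformation should surface immediately instead of silently corrupting messages from older clients.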
It is at this point that MDSD enters the stage. For each incompatible change of the API, a corresponding message transformation has to be defined. In an active project, changes of this kind are the rule rather than the exception, so writing transformations becomes a frequent and mission-critical task. Forgetting a single transformation leads to corrupt interfaces for prior API versions.
If we follow a model-driven approach, the interface definition is expressed in some kind of model, and for each version of the software system a corresponding version of the model exists. Changes between two interface versions can be recovered by comparing the two corresponding model instances. The result is a so-called diff model which contains all changes made between the two versions. Taking this as input, you can produce a change report which lists all changes in human-readable form. Using this report as a checklist gives you control over all changes that have taken place in the system. Overlooking a change and forgetting to write the corresponding transformation become things of the past.
But you can go even further with a diff model. Groups of interface changes follow similar patterns. Adding a new mandatory field to a transfer data type, for example, always results in the same kind of message transformation. Writing transformations for such changes is tedious and repetitive, which makes it a perfect candidate for automation. Without model-driven support, developers have to write all transformations manually. However, since the complete change information is already available, message transformation code for changes that follow identified patterns can be generated from the diff model. This relieves developers of repetitive and error-prone tasks and saves time that is better spent on more challenging topics. Certainly, this approach has its limitations: it is utopian to expect 100% generation of the transformation code.
There will always be complex changes in the model which do not fall into an identified pattern category. For those situations, transformations have to be written manually. We therefore end up with both generated and manually written transformations and have to bear in mind the best practices for separating generated and manually written code.
But why spend so much effort on staying backward compatible? This question is valid and has to be answered for every product independently. For historical or legal reasons, you often have no choice other than to remain backward compatible. This is especially true for the eHealth domain. Many medical practices which communicate with centralized external systems still run in a very antiquated environment and expect a stable interface to communicate with. Changing the target interface would imply adaptations in each of the medical practices. Investing time to solve this incompatibility conflict on the target side is often more reasonable than migrating all the heterogeneous legacy environments out in the field.
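The diff model and change report described for API versions can be illustrated with a deliberately small sketch. It assumes a heavily simplified interface model, a map from field name to type name, and the class InterfaceDiff with its nested types is hypothetical. A real tool chain would compare full model instances, for example with a model comparison framework, rather than use this hand-rolled comparison.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, heavily simplified interface model: field name -> type name.
final class InterfaceDiff {
    enum ChangeKind { ADDED, REMOVED, TYPE_CHANGED }

    // One entry of the diff model.
    static final class Change {
        final ChangeKind kind;
        final String field;

        Change(ChangeKind kind, String field) {
            this.kind = kind;
            this.field = field;
        }

        @Override
        public String toString() {
            return kind + " " + field;
        }
    }

    // Compares two versions of the model and returns the changes,
    // sorted by field name so the report is stable.
    static List<Change> diff(Map<String, String> oldModel, Map<String, String> newModel) {
        Map<String, Change> changes = new TreeMap<>();
        for (Map.Entry<String, String> e : oldModel.entrySet()) {
            String newType = newModel.get(e.getKey());
            if (newType == null) {
                changes.put(e.getKey(), new Change(ChangeKind.REMOVED, e.getKey()));
            } else if (!newType.equals(e.getValue())) {
                changes.put(e.getKey(), new Change(ChangeKind.TYPE_CHANGED, e.getKey()));
            }
        }
        for (String field : newModel.keySet()) {
            if (!oldModel.containsKey(field)) {
                changes.put(field, new Change(ChangeKind.ADDED, field));
            }
        }
        return new ArrayList<>(changes.values());
    }
}
```

Each returned entry is one item on the change checklist. Entries that match a known pattern, such as ADDED, could be handed to a generator, while the remaining ones indicate where a manually written transformation is needed.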
Database abstraction through the use of OR mappers, such as Hibernate, is both a blessing and a curse. On the one hand, it provides developers with the desired abstraction from the underlying database technologies. On the other hand, it hides derailments of schemas between versions. Each change in the persistence model requires rework in the database schemas of already deployed applications. For example, adding a new persistent field to a domain class also requires adding a column to the corresponding database table. If the persistence model and the database schemas diverge and are not kept in sync, we speak of a schema derailment.
Again, MDSD bails us out. If we treat the persistence model as a first-class artefact and produce the underlying database schema from it, we can reuse the model information for compatibility purposes. Having the model at hand in different versions, we can again produce a diff model which provides useful information about the changes between versions. As before, we can then generate a change report that details all changes that have been carried out. Additionally, we are able to generate SQL upgrade scripts from the diff model. Again, 100% generation of the scripts is neither intended nor realistic. Generated scripts always need to be supplemented with manually written scripts which cover the more complex cases of schema deviation. The combination of generated and manually written artefacts inside a version interval defines the migration path for upgrading a database schema to its next version.
Since various SQL dialects exist, upgrade scripts have to be provided for every supported database and have to comply with its dialect. With migrations, the Ruby on Rails community provides a database-agnostic DSL to describe schema changes. These schema change descriptions are then translated into SQL statements for various dialects. Instead of producing SQL directly, the model-driven approach could transform the diff model into migration expressions. In doing so, the responsibility for producing SQL statements is shifted to the migrations framework.
Through the support provided by MDSD, the dread associated with touching and changing the domain model vanishes. Developers no longer hesitate to evolve the domain model, and development projects regain their agility.
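As an illustration of generating upgrade scripts from two versions of a persistence model, consider the following sketch. It rests on several assumptions: the persistence model is reduced to a map from column name to SQL type for a single table, the class SchemaUpgradeGenerator is hypothetical, and the emitted ALTER TABLE statements use common syntax that individual dialects may not accept unchanged.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical generator: a persistence model is reduced to
// column name -> SQL type for one table. Real dialects differ,
// so the emitted statements are illustrative only.
final class SchemaUpgradeGenerator {
    static List<String> upgradeScript(String table,
                                      Map<String, String> oldColumns,
                                      Map<String, String> newColumns) {
        List<String> statements = new ArrayList<>();
        // Sorted copies give a deterministic statement order.
        for (Map.Entry<String, String> e : new TreeMap<String, String>(newColumns).entrySet()) {
            if (!oldColumns.containsKey(e.getKey())) {
                statements.add("ALTER TABLE " + table
                        + " ADD COLUMN " + e.getKey() + " " + e.getValue());
            }
        }
        for (String column : new TreeMap<String, String>(oldColumns).keySet()) {
            if (!newColumns.containsKey(column)) {
                statements.add("ALTER TABLE " + table + " DROP COLUMN " + column);
            }
        }
        // Type changes and renames are deliberately not handled here:
        // they belong to the manually written part of the migration path.
        return statements;
    }
}
```

The sketch only covers the two simplest change patterns. Anything involving data conversion stays in hand-written scripts, in line with the split between generated and manual artefacts described above.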
When using model-driven techniques, we typically use domain-specific languages (DSLs) to express our models. Such DSLs are perfectly suited to describing the target domain because they provide us with the necessary domain-specific expressions and abstractions. Compared with general purpose languages (GPLs), DSLs are less stable and more subject to change. As the underlying domain and meta model evolve, and additional concepts and constructs are added to the DSL or replace older ones, the language itself evolves too. Unfortunately, domain-specific languages themselves are not immune to the incompatibility disease.
But there are ways and means to absorb potential conflicts as they arise. This time, we harness the Anticorruption Layer pattern from the Domain-Driven Design camp. As the name states, this pattern addresses corruption. In a language change scenario, the corruption occurs the moment we change the meta model of a language. All models satisfying the old meta model are regarded as corrupted since they do not conform to the new meta model version. Without further measures, the legacy meta model is no longer supported and clients are forced to migrate to the language based on the new meta model. Through a dedicated Anticorruption Layer, we are able both to preserve old meta models and to drive the evolution towards a more sophisticated meta model. The Anticorruption Layer is a combination of façades, adapters and translators and is in charge of translating between different model representations. In addition to the meta model versions visible to clients, we also maintain an internal meta model. This internal meta model is the primary meta model used for further processing. All external model instances get translated into model instances conforming to the internal meta model. The external meta models thus each provide a certain view of the target domain. Through the Anticorruption Layer, we have an instrument to map those multiple views to a ubiquitous meta model. A controlled meta model deprecation strategy is easy to implement if the version translations inside the Anticorruption Layer are clearly isolated.
A simple example shows the benefits of the Anticorruption Layer. Suppose you have your own DSL to organize your luxury car garage capacities worldwide. In version 1, you specify width, height and length for each car. After a few months, you realize that your garages only have parking lots of three sizes: S, M and L. So the new version stores a size instead of dimensions. Instead of rolling out the new version to every garage at once, you implement an Anticorruption Layer which still understands the initial meta model with dimensions but internally translates to the meta model using parking lot sizes.
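The garage example can be turned into a small translator. Everything in the following sketch is hypothetical: the class names, the assumption that dimensions are given in metres and, in particular, the size thresholds, which are invented purely for illustration.

```java
// Hypothetical model classes for the garage example; the size thresholds
// are invented for illustration only.
final class GarageAnticorruptionLayer {
    enum Size { S, M, L }

    // External meta model, version 1: a car is described by its
    // dimensions (assumed to be in metres).
    static final class CarV1 {
        final double width;
        final double height;
        final double length;

        CarV1(double width, double height, double length) {
            this.width = width;
            this.height = height;
            this.length = length;
        }
    }

    // Internal meta model: a car is described by the parking lot size it needs.
    static final class Car {
        final Size size;

        Car(Size size) {
            this.size = size;
        }
    }

    // Translator: maps a version 1 model element to the internal meta model.
    // The height is ignored in this simplified mapping.
    static Car translate(CarV1 legacy) {
        if (legacy.length <= 4.0 && legacy.width <= 1.8) {
            return new Car(Size.S);
        }
        if (legacy.length <= 4.8 && legacy.width <= 1.9) {
            return new Car(Size.M);
        }
        return new Car(Size.L);
    }
}
```

Garages that still deliver version 1 models keep working because only the translator knows about dimensions. All further processing sees the internal representation with sizes.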
Rules of Thumb
As you have seen in the examples, we are faced with incompatibility in a number of different flavors. This article concludes with a set of important rules to keep in mind when dealing with compatibility paired with MDSD:
Backward compatibility is user-friendly
Sophisticated and popular applications promise growth of customer numbers. The larger your clientele becomes the more requirements you will be confronted with. So, successful applications tend to be developed further to produce subsequent versions. The past has provided us with enough examples of disappointed customers struggling with incompatibilities. Being backward compatible to a very high degree satisfies and binds customers to your solutions. Happy customers are the key driver of long-lasting systems.
Backward compatibility is expensive
Compatibility does not come for free. Often, life cycle aspects are sleeping giants which architects do not want to wake. Once they are awakened, typically very late in development, they often require tremendous effort to bring under control. Through model-driven techniques, compatibility aspects can be assessed and integrated right from the beginning, so that compatibility concerns no longer limit the evolution of successful products. Use incremental migration to keep costs low and complexity manageable: only provide linear migration paths between succeeding versions instead of providing migration paths between all supported versions. By chaining migrations you achieve the same goal.
Nobody is immune to incompatibilities
Compatibility has to be considered in nearly every product in industry. Only for purely internal products can usage and compatibility claims possibly be regulated. All other products aimed at a broader audience need to avoid navigating towards a dead end. Projects need to remain adaptive by staying backward compatible, and vital enough to evolve in the desired direction. Only then do products have a chance to survive and become as old as the hills. So never treat compatibility as an afterthought. This is especially true for products developed publicly. As a matter of fact, Open Source developers never know how and where their products might be applied.
Once released, it's legacy
A release version number only represents a particular snapshot of a system's lifetime. Once a certain version is released, development has already advanced, so the release represents a development state of the past. Because the release is already shipped, you have to support it in order to keep customers of the released version satisfied.
Model deltas are again models
The first two examples emphasized that model deltas again hold valuable information that could be used for further processing. Harvesting diff models from existing model versions requires only little effort if a model-driven approach is already applied. The existing tool chain needs to be extended with additional steps to produce and process the diff model. The environment (i.e. version control system) could be leveraged to maintain existing model versions.
Anticorruption Layer as best practice
With the Anticorruption Layer we have a powerful pattern at hand to stay both compatible and flexible. It allows for clear isolation of inside and outside representations while keeping a strong nexus between them at the same time. Its introduction into existing tool chains is minimally invasive and can be performed a posteriori. Once integrated, it allows both ends to be evolved independently.
Useful Frameworks and Tools
Looking at MDSD frameworks in Java, openArchitectureWare (oAW), a "tool for building MDSD/MDA tools", is the most dominant. It is a sub project of the Eclipse Modeling Project (EMP) and provides robust solutions for model-driven development. In the EMP, the Eclipse Modeling Framework (EMF) is the major building block and provides sophisticated support around structured data models. Since oAW uses EMF technologies heavily, we recommend using both in combination. EMF Compare is an implementation for comparing EMF models which seems to be perfectly suited for our needs of creating valuable diff models.
About the Author
Andreas Kaltenbach is Software Developer and Trainer at InterComponentWare AG (ICW), a specialist for healthcare applications based in Walldorf, Germany. He is focused on MDSD and security in the context of ICW's eHealth Framework (eHF) and beyond. In addition, he is one of ICW's representative developers for the Open eHealth Foundation, which is focused on establishing an open source community for the healthcare industry.
Links and Literature
 Völter, Stahl: Model-Driven Software Development (2006)
 Efftinge, Friese, Köhnlein: Best Practices for Model-Driven Software Development (2008) http://www.infoq.com/articles/model-driven-dev-best-practices
 Cloneable doesn't define .clone, http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4098033
 Bloch: Effective Java: A Programming Language Guide (2008), item 11
 Gosling et al.: The Java Language Specification, Third Edition (2005), chapter 13
 OSGi Alliance: OSGi Service Platform Core Specification (2007), chapter 3.6.2
 Hohpe: Enterprise Integration Patterns (2003), chapter 8
 Understanding Migrations, http://wiki.rubyonrails.org/rails/pages/understandingmigrations
 Eric Evans: Domain-Driven Design: Tackling Complexity in the Heart of Software (2003), chapter 14
 openArchitectureWare, http://openarchitectureware.com/
 Eclipse Modeling Framework, http://www.eclipse.org/modeling/emf/
 EMF Compare, http://wiki.eclipse.org/index.php/EMF_Compare
 InterComponentWare AG, http://www.icw-global.com/
 Open eHealth Foundation, http://www.openehealth.org
COPE framework - Coupled Evolution of Metamodels and Models
Re: Diff Model
I strongly believe that a "Change DSL" where you explicitly describe the changes to the model is the key here. The diff model can then be used as a validation that your change description is complete and that you can derive the new from the previous model.
From a transition written in the "Change DSL" (which can be based on oAW Xtext), the backward compatibility and database scripts can then be easily derived.
Re: Diff Model
1. We need a tool to estimate the diff between two models. With EMF Compare, this already exists. For sure, EMF Compare has its limitations - in the worst case it assumes that there is no similarity at all between two models. A robust solution has to absorb this case and has to react accordingly.
2. With the diff at hand we can do whatever we want. I highlighted two possible approaches (API compatibility & Schema Derailment) in the article.
Yes Karsten, it is important to confess that such solutions will never release developers from thinking about compatibility concerns of their products. A 100% generation of backwards compatibility artefacts is neither desired nor possible in my opinion. The sum of compatibility artefacts consists of generated and manually written code. I tried to indicate this in the pictures by using blue for generated and grey for manually written artefacts. For the manual part, such a "Change DSL" you mentioned seems to be the perfect match ;-)
The diff tooling has two strong benefits:
First, it provides guidance and support for developers. It is able to detect deviances between model versions. Even the availability of system change information helps to make developers aware of incompatibilities.
Second, such tooling covers changes which happen often and follow a certain pattern (e.g. the typical renaming of model elements). For such recurring patterns it is worth writing code once that produces the needed artefacts. Once you have described which artefacts should be derived for a certain change pattern, you can reuse that description whenever a change following the same pattern is detected.