Model Driven Engineering (MDE) seems to be (re-)gaining some interest lately. Is it a new wave such as “MDA” (Model Driven Architecture) and “DSL” (Domain Specific Languages), or is it a more profound movement that will reshape the entire software industry as it has always been forecast since the first CASE tools were invented?
The most recent developments in MDE include for instance textual-DSLs (as opposed to visual) and internal-DSLs (as opposed to external).
Model Driven Engineering has been the focus of an intense research effort. Some DSLs, like HTML, have enjoyed worldwide success, but they were often designed well outside the boundaries of this research effort, in a very pragmatic way. Similarly, metadata is everywhere: as descriptors, attributes and annotations, yet without much theoretical foundation for its relationship to programming models.
As is too often the case in our industry, camps form and keep arguing at each other across a “mind made” fence without ever looking for a unified approach that would bring together textual and visual, external and internal and, most importantly, code and models. As pundits keep “moot pointing” at each other, IT is losing its battle, unable to support the basic needs of the business at a reasonable cost: everything IT does takes months or years and costs millions, if not tens of millions.
The goal of this paper is to introduce a unified approach to programming, an approach where models and code coexist and build on each other instead of opposing each other. An approach where the model is the code and the code is the model. An approach that could lead to a renewed prosperity for our industry and that would enable everyone to contribute to its fullest instead of wastefully opposing concepts, technologies and architectures. Finally, the goal of this paper is to show the way towards “Architecture Refactoring” and “Architecture Defactoring”.
The first part of the paper introduces a new (unifying) taxonomy for general purpose languages and models. The second part focuses on the new concept of “cogent-DSL” and the specification of a new M3 layer: the Meta-Architecture-Framework. The last section explores how we could achieve “Architecture Refactoring” as well as “Architecture Defactoring” with this approach.
A New Classification for “Programming Models”
Even though many practitioners, such as Enterprise Solution vendors, have long understood the need for scripting capabilities as a complement to model driven engineering, very few Model Driven Engineering approaches, tools and frameworks deal with this problem efficiently, if at all. Back in 2001-2003, I was the Chief Architect of Eigner PLM, a model driven Product Lifecycle Management solution. My role was to re-architect the solution using a more modern, JEE-based, technology platform. The “old” architecture had been designed and built in 1991 and was model driven: forms and data structures were described using metadata. It was actually one of the key differentiators of the company in the 90s. Competitors who had set out to build a framework of objects (in the IBM San Francisco project style) were failing miserably at giving their customers the ability to customize and upgrade these solutions. A Model Driven approach was the key to enabling customization and upgrade-ability. Yet, the key to Eigner’s programming model was a proprietary scripting language, as Eigner’s metamodel allowed for “user exits” strategically woven in and around each metamodel element. All the other enterprise solution vendors support some kind of proprietary scripting language (graphical or not) as well, since no one has been able to create a DSL that could efficiently describe a solution without the help of these scripting elements. For the remainder of this paper, we will call these scripting elements “execution elements”.
Definition 1: An execution element is an element of the DSL that is expressed in imperative style and can manipulate some or all elements of the metamodel, including creating new instances.
I suggest that we call the DSLs that do not have any execution elements “anemic DSLs”. An example of an execution element in a DSL is the method of a class: a method is a DSL element with a property that contains the implementation of the method. I propose to call a DSL that has one or more execution elements a cogent-DSL.
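To make the distinction concrete, here is a rough sketch in Java (all names are purely illustrative and not part of any existing framework): an anemic DSL element carries only structure, while a cogent DSL element also owns execution elements whose bodies contain imperative logic.

    // Illustrative sketch only: an anemic DSL element is pure metadata,
    // while a cogent DSL element also carries execution elements.
    import java.util.ArrayList;
    import java.util.List;

    class AttributeElement {              // anemic: structure only
        String name;
        String type;
    }

    class MethodElement {                 // execution element: the "body" property
        String name;                      // holds imperative logic that manipulates
        String body;                      // instances of metamodel elements
    }

    class ClassElement {                  // cogent: structure plus execution elements
        String name;
        List<AttributeElement> attributes = new ArrayList<>();
        List<MethodElement> methods = new ArrayList<>();
    }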
Theorem 1: General Purpose Languages like C, Java or C# are cogent-DSLs
Even though that might appear to be news to some of you, Java is a DSL, and so is C#: they are cogent-DSLs. They don’t relate to any particular “domain”, but they are nevertheless DSLs. This proposition is consistent with Jean Bézivin’s definition of l-models.
Theorem 2: Real-world solutions can only be constructed with cogent-DSLs
HTML is a cogent-DSL under the covers: HTML, without JavaScript and the hooks it provides to invoke execution elements written in JavaScript, would probably never have had the success it has enjoyed so far.
An example of proprietary scripting language is the Xion language developed by William El Kaim, Olivier Burgard and Pierre-Alain Muller as part of their “Platform Independent Web Application Model”. They explain:
We use UML class diagrams to represent business classes, their attributes, operations and relations. The implementation of methods is specified with the action language Xion… Xion is used as a query language to extract information from the Business Model and as a constraint language to express the various rules, which govern page composition.
Figure 1 introduces the role of the Xion language in the Business Web Application DSL.
Figure 1. Objexion methodology and tool usage summarized
Since Java and HTML(+JavaScript) are both cogent-DSLs, there must also be something that differentiates them as they appear and behave quite differently from each other.
Traditional programming languages like Java and C# can be qualified as being “monadic”, i.e. based on a single concept: the class. Java has one type of execution element and this execution element manipulates instances of classes.
Definition 2: a monadic programming model relies on a single DSL element
HTML, on the other hand, is “polyadic”. Its metamodel is composed of several elements (Form, Control, Table...) with execution elements attached at different levels of the metamodel and constrained to manipulate certain elements of the metamodel (JavaScript itself, however, has no such constraints).
Definition 3: a polyadic programming model is a programming model that is free of any reification
I propose that we stop using the classification of textual vs visual and internal vs external. This classification is misleading and does not expose the true nature of DSLs. I suggest, rather, that DSLs be classified along an axis ranging from anemic to cogent. An intermediate variety could be achieved by an anemic DSL (such as HTML) combined with a monadic programming language such as JavaScript. Obviously, people should not be encouraged to develop this kind of assemblage but rather to specify execution elements where necessary in the DSL.
I also suggest classifying programming languages from monadic to polyadic. Traditionally (and it seems more a tradition than a physical constraint), programming languages have been based on a single concept: a class, a function, or a process. I claim that constructing complex solutions from monadic programming models is a proposition just as flawed as using anemic DSLs to create solution models. Monadic programming languages, however widely popular, are the very reason for the inefficiencies we experience today in system construction: they require the constant use of sophisticated patterns and rules that developers have to master, in addition to the countless technological APIs developed to reify architectural elements into the monadic programming model. Most importantly, monadic programming languages expose huge discontinuities at the architecture boundaries, as each layer is usually implemented in a different language (SQL, Java, HTML…) generally optimized for its role in the architecture.
Figure 2 places some commonly known DSLs and programming languages in the proposed classification.
Figure 2. Anemic and Cogent DSLs vs Monadic and Polyadic Programming Models
DSLs like BPEL are interesting because they have long been known as “pure DSLs”, i.e. anemic DSLs. Yet, BPEL describes a programming language: an orchestration-based programming language with an execution element, the orchestration definition. So even though the syntax of this programming language can seemingly be described as an anemic DSL, it is nevertheless a cogent-DSL (see Figure 4 for a textual-DSL definition of an orchestration language).
Definition 4: Metamodel Oriented Programming (MOP) is an approach that uses a cogent-DSL that implements a polyadic programming model to specify solution models.
Object Oriented Programming is a particular case of MOP which uses a monadic programming model based on a class-based cogent-DSL.
Let’s take a quick example to illustrate how MOP differs from OOP. Clemens Vasters wrote in 2005 an “Introduction to Building WCF Services”.
WCF uses a general purpose programming language to implement a cogent-DSL. This approach is also known as an internal DSL. WCF’s programming model includes DSL elements such as ServiceContract, DataContract, MessageContract, OperationContract...
The unfortunate aspect of weaving this DSL on top of the OO metamodel (classes and methods) is that Microsoft's vision of Service Orientation ended up looking a lot like Object Orientation. This is the very problem of internal DSLs: they are bound to a single runtime and usually reify the DSL semantics into the particular runtime semantics, especially when it comes to execution elements.
A WCF Service contract looks like this:
[ServiceContract]
public interface IPeople
{
    [OperationContract]
    void StorePerson(
        [MessageHeader] UpdateBehavior updateBehavior,
        [MessageBody] Person person);
}
Now consider a (partial) Services metamodel (Figure 3).
Figure 3. SIPeople corresponding Metamodel
The MOP equivalent of this WCF Service Contract could look like:
public service.interface SIPeople
{
    public operation void StorePerson(Person person)
        throws invalidDataException(person)
    {
        MEP = In-Out;    // Message Exchange Pattern is Request-Response
        requires = {     // The Data Access Service will validate the
                         // incoming data and process the change summary
                         // if this person is already in the data store
            das.validate,
            das.update
        };
    }
}
With MOP, the OO metamodel disappears entirely from your code. Note also that in a polyadic programming model you need a syntax like service.interface, since a DSL can have multiple elements which define an interface.
The other important aspect of a cogent-DSL is that you are no longer writing code in a GPL like C# or Java. This means two things. First, you are no longer bound by the OO metamodel and reduced, for instance, to specifying only In-Out operations: you can support Out-In operations just as well in the same interface definition, as it should be. Second, you no longer have access to .Net or Java “infrastructure” APIs: the cogent-DSL expresses the solution model regardless of a particular architecture or technology.
Now if we look at how WCF handles Message Types, it once again relies on Object-Oriented data structures, just as in the case of Service Interface definitions. That means that you cannot easily weave a Message Type Architecture like the one I describe in this article on top of WCF. In MOP, all the DSL elements (entity, message type, service, object…) participate in the same polyadic programming model based on a common cogent-DSL. In the Message Type Architecture article I explain how the concept of “Projections” is used to define message types from the entity model. In MOP, I could very simply restrict operation signatures to reference “Projection” elements and nothing else. In most OO languages, on the other hand, I am bound to class definitions, which cannot implement the concept of a projection (similar to views in an RDBMS). I am entirely bound to the programming model that I chose to implement my internal DSL.
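As a minimal sketch of what such a restriction could look like when checked mechanically (the names and the check are hypothetical, not WCF or any existing API), a MOP-style metamodel could simply refuse operation signatures whose parameters are not Projections:

    // Hypothetical sketch: operation signatures may only reference Projection elements.
    import java.util.List;

    interface DslElement { String name(); }

    record Projection(String name, List<String> fields) implements DslElement {}
    record Entity(String name) implements DslElement {}

    record OperationSignature(String name, List<DslElement> parameterTypes) {
        // The constraint a MOP metamodel could impose on signatures.
        boolean isWellFormed() {
            return parameterTypes.stream().allMatch(p -> p instanceof Projection);
        }
    }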
Architectural concepts such as the ones introduced by JEE, Spring, OSGi and SCA have grown outside the OO programming model while being based on simple yet powerful metamodels. In reality, for way too long, most developers and architects have been using a Metamodel Oriented Programming approach without making it a conscious design decision, adding metadata here and there as necessary. As a result, most if not all solution models have been diluted in billions of lines of code woven into proprietary technologies, patterns and APIs. These solution models have become extremely hard to change to serve increased business needs or to follow the natural evolution of architectural components.
We have established that cogent-DSLs have been used unwittingly and inconsistently across our industry. In the next section we are going to introduce a formalism that will help specify the execution element semantics of a cogent-DSL, in a precise and consistent way across any DSL.
A Meta-Architecture-Framework for cogent-DSLs
Today when someone defines “textual-DSLs” with tools such as openArchitectureWare (using xtext), the syntax of the execution elements is left unconstrained and must be defined for each execution element in each DSL. The general assumption made here is that as long as the execution semantics are specified using a context-free grammar, we will be able to create an interpreter or compiler from it. This approach can lead to a great deal of inaccuracies and inconsistencies.
I could not locate in the literature an attempt to formalize the semantics of an execution element within a DSL specification. In his “Introduction to the Theory of Computation”, Michael Sipser explains how context-free languages are used by designers of compilers and interpreters since they are equivalent in expressive power to pushdown automata. However, the theory of computing developed in this reference book does not address the implications of modern software architectures on programming concepts.
In his seminal talk at OOPSLA 1997, Alan Kay best expresses the perceived implications that architecture will have in the future. He introduces the “Universal Interface Language” that should enable “objects” to better introspect each other’s capabilities. He also introduces “Ma”, a Japanese word that describes what is between objects.
In the “Software Factories” book, Steve Cook and Stuart Kent explore the anatomy of languages (Chapter 8). Again, little or no effort is made by the authors to understand the implications of architecture. They actually provide a sample language “OSL” (Our Simple Language) that “uses a mixture of graphical and textual notation. The graphical notation has text embedded and there is a separate text fragment defining the body of the methods”. The authors continue and explain that “because of its graphical notation, many people would classify it as a modeling language, rather than a programming language”. As we have seen in the previous section, this distinction is artificial. A more meaningful relationship between textual and visual can be inferred from the authors’ discussion of the differences between Context-Free Grammars and Metamodels (using UML as a specification language, for instance). They explain that “a metamodel is richer in what it can express and is able to make finer grained distinctions in the definition of a language”. They suggest that we “integrate a CFG approach with a metamodeling approach”, but again provide no systematic approach to achieve this particular goal other than a purely syntactic one.
Let’s take an example to better show how execution semantics are embedded in a given DSL. Figure 4 represents an orchestration language written in xtext. As you can see in the Orchestration element definition, we can only guess that we are entering the definition of an execution element by the use of curly brackets. It is highly inefficient to have to redefine execution semantics every time. Such an approach would also risk turning Model Driven Engineering into an unstructured collection of (poorly structured) micro-languages which would be hard to learn (because designed by many different people) and to implement.
Figure 4. A textual-DSL for an orchestration language (specified using xtext)
Internal DSLs have, by definition, built-in execution semantics. However, they are also constrained by the semantics of the programming environment they are “internal to”, as we have seen in the case of WCF. When you need orchestration-based semantics in Ruby, you are out of luck. In addition, internal languages do not let you easily specify what execution elements can and cannot do: once you are in a general purpose programming environment you can manipulate anything you want, and Ruby gives you a great deal of freedom. With cogent-DSLs you define precisely which elements of the DSL can be manipulated by each execution element. Furthermore, cogent-DSLs create a programming model that does not rely on any technical API. This is key to enabling architecture refactoring: the more a solution model gets ingrained in a proprietary API, the harder it will be to change and evolve the architecture. So internal DSLs, though they are truly an interesting programming concept, will not be able to achieve what cogent-DSLs can do in terms of polyadic programming models and architecture refactoring.
Even though I could not find evidence in the literature of a general execution semantics specification, it is actually possible to describe the structure of execution elements at the M3 level (Meta-MetaModel). A lot of people run away or chuckle when someone uses the word “Meta” (not to mention M3). Yet, it is quite easy to understand what an M3 layer is: if we consider the Model of a solution, a Metamodel simply describes the structure of this model, since variants of the solution’s problem might be described by slightly varying solution models that conform to the same structure, i.e. Metamodel. Similarly, several metamodels across a wide variety of domains might share the same structure; this is the Meta-MetaModel layer (a.k.a. M3 since there are 3 Ms in it). The Meta layers usually stop at the M3 level, as this layer is always self-descriptive (it can describe its own structure).
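To make the layering concrete, here is a small, purely illustrative sketch in Java (not MOF, not KM3): each layer simply describes the structure of the layer below it, and the top layer is able to describe itself.

    // Illustrative sketch of the M1/M2/M3 layering (hypothetical names).
    import java.util.List;
    import java.util.Map;

    record MetaMetaElement(String name, List<String> features) {}              // M3
    record MetaElement(String name, MetaMetaElement type,
                       Map<String, String> features) {}                        // M2
    record ModelElement(String name, MetaElement type,
                        Map<String, String> slots) {}                          // M1

    class Layers {
        public static void main(String[] args) {
            // M3: a single construct, able to describe its own structure
            MetaMetaElement element = new MetaMetaElement("Element", List.of("name", "features"));
            // M2: a metamodel element (e.g. "Entity") conforming to M3
            MetaElement entity = new MetaElement("Entity", element, Map.of("hasAttributes", "true"));
            // M1: a solution model element conforming to M2
            ModelElement person = new ModelElement("Person", entity, Map.of("table", "PERSON"));
            System.out.println(person.name() + " conforms to " + person.type().name()
                    + " which conforms to " + person.type().type().name());
        }
    }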
Unfortunately, most M3 layers offer no help to describe the structure of execution elements. I am not sure who decided that M3 layers should be object oriented (as should M2 and M1 layers), but this seemingly simple, yet tragic, decision has driven us to think exclusively in terms of anemic DSLs and monadic programming models (or dyadic at best).
For instance, figure 12.2 of the MOF specification states that everything MUST-BE-A class (Figure 5).
Figure 5. Essential MOF Classes (from the MOF specification)
I am not arguing that the “class” model is wrong; I am, however, suggesting that it is incomplete, that when constructing a solution not everything IS-A class, no matter how powerful this modeling concept can be. What I am arguing is that the insidious monadism that tries to forcefully fit everything behind a single concept is wrong. It does not matter whether you start from a class, a process, a resource, a service or an event: monadic programming models cannot effectively create solution models. In the physical world, not a single engineered system is made of parts that are all of the same type (Legos have never been an engineering concept, they are just a toy).
Lemma 1: Industry Meta-MetaModels (M3) are incomplete
I suggest that M3 layers such as MOF and Ecore be evolved into a Meta-Architecture-Framework (MA-F) that supports the specification of execution elements in a systematic way across all programming paradigms such as procedural, orchestration-based, template-based… Context-Free Grammars are just too inefficient for that purpose.
So, what would a MA-F M3 layer look like? Let’s start with some requirements. First, MA-F’s role is to describe the structure of cogent-DSLs (at the M2 level), hence it must be able to describe the structure of execution elements in addition to the traditional structure of a DSL. Second, one of the key goals of Metamodel Oriented Programming is to create solution models that are independent of the architecture in which the solution will be deployed, while recognizing the diversity of architectural elements that participate in the construction of a solution. By achieving this goal, we should also be in a position to enable architecture refactoring and architecture defactoring (going from a patchwork of monadic programming models to a single polyadic programming model specified by a cogent-DSL). Actually, the overarching goal of MA-F is no less than defining a generic framework for polyadic programming models and supporting the construction of composite solutions (as today any system's architecture is composite).
With that in mind, let’s look at how people have specified M3 layers. Frédéric Jouault and Jean Bézivin have defined their own M3 layer: KM3 (which stands for Kernel MetaMetaModel). Their paper focuses on the organization of models, not their utilization; in particular, they view “KM3 [as …] a Domain Specific Language to define metamodels”. The authors actually do address the question of execution semantics:
A DSL may have an execution semantics definition. This semantics definition is also defined by a transformation model mapping the DDMM onto another DSL having itself an execution semantics or even to a GPL. The firing rules of a Petri net may for example be mapped into a Java code model.
Yet, they do not address the specification of these semantics. They treat them as another set of metadata. The problem with this approach is that it pushes the burden on each metamodel designer to redefine these semantics over and over. It is possible (and critical) to extend M3 layers such as KM3 to enable the definition of these execution semantics with a high degree of precision and consistency across all DSLs.
The definition of MA-F is still a work in progress, but I can introduce here the main concepts at the foundation of its design. First, an execution element has an execution style attribute which can have values such as procedural, orchestration-based, template-based… Of course, there could also be a syntactic attribute that lets you choose your favorite syntax for a given execution style. The value of this attribute would be specified at the M2 level. Ultimately, it is important to converge towards fewer execution styles: there is no value in competing there. I understand that this may be wishful thinking, as vendors might view it as another opportunity to ignite a new Java vs .Net war, but innovation must now move to the DSL’s design, not the execution semantics, and general execution semantics must get rid of any DSL-like feature. Designing powerful cogent-DSLs is where immense productivity gains reside.
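As a sketch of where such attributes could live (the names and the set of styles below are assumptions of mine, not a published MA-F specification), an M3-level execution element definition might simply carry an execution style, an optional concrete syntax and, anticipating the constraints discussed below, the set of DSL element types it may manipulate:

    // Hypothetical M3-level definition of an execution element.
    import java.util.Set;

    enum ExecutionStyle { PROCEDURAL, ORCHESTRATION_BASED, TEMPLATE_BASED }

    record ExecutionElementSpec(
            String name,                         // e.g. "operation" for a given M2 element
            ExecutionStyle style,                // execution style chosen at the M2 level
            String syntax,                       // e.g. "java-like" (assumed, optional)
            Set<String> manipulableElements) {}  // which DSL element types it may touch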
Second, the Meta-Architecture-Framework must be expressive enough to describe how DSL elements are manipulated within the definition of an execution element. Obviously, services, resources and classes behave very differently even though many people have tried to reify each of these concepts behind monadic OO concepts. As a matter of fact, we can suspect that people did that because they needed some execution semantics which could only be provided by using General Purpose Languages (GPL) since cogent-DSLs were not available, even as a concept.
To achieve this purpose, MA-F contains abstract architecture types (AAT) which only differ by their behavior. MA-F’s AATs should not be named with common names; they should use a letter instead (e.g. “C”) and numbers for variants (“C1”, “C2”). AATs and their variants should be kept to the lowest number possible. An architectural element typically has a container (such as an OO runtime, a Service Container, a.k.a. an ESB, a Business Rules engine, a domain container for SCA…) and a lifecycle with respect to its container. The container is not described in MA-F, as the goal of MA-F is to create solution models that can be deployed in a variety of containers that all support the same AATs. The key relationship between a container and an architecture element is the lifecycle of this element with respect to the container. We can actually expect that AAT lifecycles will be very stable and supported across various containers. We should also expect that containers will focus on what they are supposed to contain rather than supporting DSL-level concepts like EJBs, for instance. Let’s take some examples of AATs.
For instance, a “C” Type lifecycle would be:
Figure 6. A “C-type” lifecycle
A “D” Type would be:
Figure 7. A “D-type” lifecycle
And an “S” type:
Figure 8. An “S-type” lifecycle
As you can guess, an “S” type matches the lifecycle of a Service, but not just of a service. For instance, an SCA assembly (which may or may not be a service) or a Spring application context have a similar lifecycle. This is why they are called “Abstract Architecture Types”. I have long debated whether the lifecycle definition should sit at the M3 or M2 layer. My conclusion was that you want these lifecycles to be highly stable and reusable elements of the programming model, hence they should be defined at the M3 layer.
At this point you are probably wondering how a lifecycle plays into the programming model. Let’s take a general purpose OO language. This cogent-DSL has a concept called “Class”. Not coincidentally, the stereotype of a Class (i.e. its Abstract Architecture Type) is “C”. The lifecycle allows us to define the core rules for manipulating a particular DSL element. In this case, the execution elements of the DSL are “methods”, and of course a method is allowed to create instances of other classes:
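Since the lifecycle figures are not reproduced here, the states below are assumed purely for illustration; the point of the sketch is that an AAT is little more than a named, reusable lifecycle that containers agree to implement.

    // Illustrative sketch of an "S-type" AAT as a reusable lifecycle (states assumed).
    import java.util.Map;
    import java.util.Set;

    enum SState { DEFINED, DEPLOYED, STARTED, STOPPED, UNDEPLOYED }

    class STypeLifecycle {
        // Allowed transitions; a container implements the actions behind them.
        static final Map<SState, Set<SState>> TRANSITIONS = Map.of(
                SState.DEFINED,    Set.of(SState.DEPLOYED),
                SState.DEPLOYED,   Set.of(SState.STARTED, SState.UNDEPLOYED),
                SState.STARTED,    Set.of(SState.STOPPED),
                SState.STOPPED,    Set.of(SState.STARTED, SState.UNDEPLOYED),
                SState.UNDEPLOYED, Set.of());

        static boolean canAdvance(SState from, SState to) {
            return TRANSITIONS.getOrDefault(from, Set.of()).contains(to);
        }
    }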
void myMethod(A a)
{
    B b = new B();
    // Do something with b and a
    b.~B();
}
The mere fact that a class instance can be created and later released provides the specification for constructors and destructors. We should not be fooled by the anthropomorphism of creation and destruction; they are simply methods that advance the lifecycle of a particular entity. These methods will be the ones implemented by the container. In the cogent-DSL programming model we are simply signaling to the container our intent to advance the state of the lifecycle.
Some DSL elements may have an empty lifecycle; in that case, they cannot be manipulated by execution elements. They can only be manipulated by the DSL deployment infrastructure and are defined as pure metadata.
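A minimal sketch of what this could mean on the container side (again, hypothetical names and an assumed “C-type” lifecycle): the container interprets “create” and “destroy” as nothing more than requests to advance an instance’s lifecycle.

    // Hypothetical container: "new" and "~" are lifecycle-advance signals, not language magic.
    import java.util.HashMap;
    import java.util.Map;

    enum CState { NEW, ACTIVE, RELEASED }   // assumed "C-type" states

    class Container {
        private final Map<Object, CState> lifecycle = new HashMap<>();

        Object create(Object instance) {     // backs the DSL's "new"
            lifecycle.put(instance, CState.ACTIVE);
            return instance;
        }

        void destroy(Object instance) {      // backs the DSL's "~" signal
            if (lifecycle.get(instance) == CState.ACTIVE) {
                lifecycle.put(instance, CState.RELEASED);
            }
        }

        CState stateOf(Object instance) {
            return lifecycle.getOrDefault(instance, CState.NEW);
        }
    }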
Before I continue, I am sure many of you would tell me that in most OO languages you don’t explicitly call a destructor. That rule could be specified at the M2 layer: as you specify the structure of a particular DSL element (e.g. a Class), you might also override parts of the lifecycle to state that the destructor cannot be called directly within an execution element. However, since it is part of the lifecycle, the destructor must be defined (as it will be called automatically by the container).
Obviously, a service’s semantics would look very different and would probably only be available in an Operations & Management solution, not in a business solution. The business solution will simply assume that the service is available.
void myMethod(A a)
{
    S s = deploy S();
    // configure s
    s.endpoint = a.endpoint;
    // Do something with s and a
    s.^start(); // start s
}
REST presents an interesting programming model challenge with PUT, an idempotent action. PUT can both create and update a resource. Idempotency simply means that a given state in the lifecycle of the AAT has a transition to itself. That is because a “persistent” data structure makes little conceptual difference between creating and updating a record. For decades, the RDBMS community has used the concept of MERGE or UPSERT. In order to support an UPSERT concept, we must create a variant “D1” of the “D” lifecycle:
Figure 9. Variant of the "D" AAT Lifecycle supporting the concept of UPSERT
An execution element supporting UPSERT could be structured as follows:
void myMethod(A a)
{
    try {
        S s = merge S(a);
        // Do some work
        ...
        // Upsert s
        s = merge S(s); // merge s
    } catch (Exception e) { }
}
Each transition in the lifecycle has a corresponding action which can be bound to operations defined at the M2 level.
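A possible way to express such a binding (the keyword names are taken from the examples above; the structure is an assumption of mine) is to map each M2-level operation keyword onto a transition of the underlying AAT lifecycle, with a transition to self capturing idempotency:

    // Hypothetical binding of M2-level keywords to AAT lifecycle transitions.
    import java.util.Map;

    record Transition(String from, String to) {}

    class TransitionBindings {
        static final Map<String, Transition> BINDINGS = Map.of(
                "new",   new Transition("none",    "created"),
                "merge", new Transition("created", "created"),   // transition to self = idempotent UPSERT
                "~",     new Transition("created", "deleted"));
    }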
The last piece of information we need to specify to fully define an execution element is the DSL elements that it can manipulate. Figure 10 represents a simple DSL composed of 3 elements.
Figure 10. A DSL with an execution element (operation)
Element1 has one execution element defined, called “operation”. An Element1 must have at least two operations defined, and these operations are allowed to manipulate instances of Element3 and only Element3. They are not even allowed to manipulate elements of their own type!
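A sketch of how these two constraints could be declared and checked (hypothetical names; this is nothing more than the rules of Figure 10 written down) might look like this:

    // Illustrative check of the Figure 10 constraints: at least two operations,
    // and operations may only manipulate Element3 instances.
    import java.util.List;
    import java.util.Set;

    record Operation(String name, Set<String> manipulatedTypes) {}

    record Element1(String name, List<Operation> operations) {
        boolean conformsToMetamodel() {
            return operations.size() >= 2
                    && operations.stream()
                                 .allMatch(op -> Set.of("Element3").containsAll(op.manipulatedTypes()));
        }
    }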
I want to emphasize that MA-F is compatible with the definition of a Model given by Frédéric Jouault and Jean Bézivin (Definition 2). In essence, execution elements can be defined by adding metadata in the M3 layer. This is far more efficient than pushing the burden of this definition onto the DSL designer at the M2 level, which would result in incomplete, proprietary and inconsistent execution semantics.
As we have seen in this section, MOP, with the help of MA-F, helps you define polyadic programming models which can efficiently define execution semantics that combine different programming paradigms and different architecture elements and, most importantly, let you constrain which element types can be manipulated by a specific execution element. I believe that this capability will open the door to the design of extremely productive, technology independent, architecture friendly programming models.
Architecture Refactoring and Defactoring
In the last 15 years, software architecture has made tremendous progress by exploiting the capabilities offered by distributed computing. As Robin Milner noted in his seminal book “Communicating and Mobile Systems: the π-Calculus”:
Building communicating systems is not a well-established science, or even a stable craft; we do not have an agreed repertoire of constructions for building and expressing interactive systems, in the way that we (more-or-less) have for building sequential computer programs.
But nowadays, most computing involves interaction – and therefore involves systems with components which are concurrently active. Computer Science must therefore rise to the challenge of defining an underlying model.
For the most part, in the last ten years, we have learned, painfully, how to build connected systems, in part because of the advances brought forward by Robin’s π-Calculus. Because of these major architectural advances, what once required a large and extremely skilled team of senior developers and architects can nowadays be built in weeks by a few junior developers. Yet, we are reaching another point of complexity: as connected systems become easier to build, they are increasingly harder to maintain and evolve (even from a pure technical infrastructure perspective), because the solution model is trapped in proprietary APIs and spread across several layers.
The problem that Metamodel Oriented Programming addresses is the creation of solution models that are architecture independent via the use of Abstract Architecture Types (AAT). A MOP solution model contains only the elements of the solution model and nothing else. These elements are then translated into deployable or executable artifacts. When architecture elements change, the translation engines can be rewritten without significant changes to the solution model (if any). This type of platform independence is of course not new. OMG’s Model Driven Architecture had already targeted a similar concept, but without the use of execution elements, which are essential to produce fully functional Solution Models. More recently, Intentional Software introduced a framework based on similar principles, but again without execution elements.
I am not claiming that Architecture Refactoring can be achieved “that easily” with MOP, since translation engines can be quite complex to write; I claim, however, that they are easier to write than rewriting the large number of solutions that depend on a particular architecture infrastructure. MA-F makes it easier because it standardizes and formalizes the semantics of execution elements.
I also claim that MOP and cogent-DSLs can support a new approach: Architecture Defactoring. The Eclipse foundation is hosting the MoDisco project:
MoDisco (for Model Discovery) is an Eclipse-GMT project for model-driven reverse engineering. The objective is to allow practical extractions of models from legacy systems. Because of the widely different nature and technological heterogeneity of legacy systems, there are several different ways to extract models from such systems. MoDisco proposes a generic and extensible metamodel-driven approach to model discovery.
MoDisco is part of the ModelPlex European project (MODELling solution for comPLEX software systems) and not surprisingly, Jean Bézivin is participating.
Architecture Defactoring seems like a natural progression once you understand that when people write code they most often have in mind (consciously or unconsciously) a cogent-DSL as part of a polyadic programming model. It is of course very hard to achieve when all you have is a monadic programming model. People compensate by using various patterns (explicit or not) in their code. These patterns can potentially be de-reified into a cogent-DSL. MOP should eliminate the need for these patterns. Some new patterns may appear, but the goal of MOP is to program without the need for patterns; this is what cogent-DSLs and their resulting polyadic programming models can achieve. Patterns are a product of the intense reification imposed by monadic programming models.
For instance, if we take a look at creational patterns, e.g. the Abstract Factory pattern, we see that a cogent-DSL should abstract away the need to use this pattern in the cogent-DSL itself (not in the interpreter or the code generation tool that processes the c-DSL into an executable).
If we go back to the 3-element DSL example above, we can see that an Element1 instance has the ability to manipulate Element3 instances. Element3 is stereotyped with a resource lifecycle. This means that an Element1 operation can do this kind of thing (but remember, it cannot manipulate Element2s):
void myOperation()
{
    Element3 r = new Element3();
    ...
    // do some stuff with it
    r.&Element3(); // move r into an archive state
    ...
    // well, we can't do much now that it is in an archive state
    r.~Element3(); // delete the resource
}
As you implement translators that target specific architecture stacks, you may well use an Abstract Factory pattern to implement new Element3(). Yet, in the solution model, all you care about is that you need a new element; you just don't know how this will happen. Now, some code might be written in the Element3() constructor, or it might be entirely virtual and simply resolved at "Architecture Factoring" time, using in effect an abstract factory pattern (or not). What MOP gives you is an opportunity to completely separate the business logic that is specific to the solution (and written in the constructor implementation) from what is specific to the architecture (which is using the Abstract Factory pattern, for instance) in the translator and the container.
This approach should work just as well for “architecture defactoring”: you could potentially detect Abstract Factory patterns in a given code base and de-reify them into a cogent-DSL. One of the most important things to realize in this de-reification process is that DSL elements have a lifecycle and that lifecycle must be de-reified from the code as well. If you don't do that, the de-reification process might not succeed, because some intermediary actions (such as "archive" in the example above, noted &Element3()) might not easily fit in the cogent-DSL or even be detectable by the de-reificator.
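To illustrate the separation (a sketch under assumed names; the JPA-flavored factory is just one possible target stack, not a prescribed one), the solution model only ever says "new Element3()", while the translator decides whether an Abstract Factory stands behind it:

    // Illustrative translator-side realization of "new Element3()".
    interface Element3 { }

    interface Element3Factory {                            // architecture-side concern
        Element3 create();
    }

    class JpaElement3Factory implements Element3Factory {  // assumed target stack
        public Element3 create() {
            // persistence-specific wiring would live here, not in the solution model
            return new Element3() { };
        }
    }

    class Translator {
        private final Element3Factory factory;
        Translator(Element3Factory factory) { this.factory = factory; }

        Element3 realizeNewElement3() {                    // what "new Element3()" maps to
            return factory.create();
        }
    }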
Other GoF patterns that become obsolete include: Singleton, Builder, Adapter, Bridge (of course), Proxy, Flyweight, Chain of Responsibility (a very interesting concept for a cogent-DSL), Command, Interpreter, Mediator, Observer, State, Memento, Visitor (again, very interesting to implement with a cogent-DSL). I will write an article on this topic in the coming weeks, with an emphasis on de-reification.
Conclusion
Metamodel Oriented Programming is based on two new concepts: cogent-DSLs and polyadic programming models. These two concepts were themselves introduced by simply extending current modeling architectures at the M3 level with a precise definition of execution element semantics using Abstract Architecture Types, Execution Styles and constraints (defined at the M2 level) which limit the types of DSL elements they can manipulate.
The proposed approach automates the production of architecture-independent context-free languages while focusing the elaboration of models on the building blocks that are most relevant to the solution model.
These concepts have the potential to yield a new generation of programming languages with greatly improved productivity, as they focus only on the solution model's structure. In addition, these programming languages have the potential to support architecture refactoring and architecture de-factoring.
This paper is also a call to action: a call to consciously start designing cogent-DSLs, translators and containers; a call to demonstrate that cogent-DSLs can improve productivity by a factor of 50 to 100 over general purpose languages and support architecture refactoring; and, finally, a call for de-factoring the billions of lines of code that have buried our solution models deep within themselves.
The author would like to thank William El Kaim, Dave West, Subbu Allamaraju, Johan den Haan and Pierre Bonnet for their valuable comments and suggestions, and Boris Lublinsky and Mark Little for their support.