Model Driven Development and Domain Specific Language Best Practices
Model Driven Development has seen increased interest over the past few years, mainly in association with metaprogramming, textual DSLs and language workbenches like Xtext or MPS. Markus Voelter, who has contributed significantly to these technologies, has published an update to his 2008 paper on "MDD and DSL Best Practices". One of his key updates is stunning:
Modular languages are possible, languages can be extended, and the distinction between modeling and programming goes away almost completely. This has far-reaching consequences for how model driven development can and should be approached.
Even though Markus still thinks that making a DSL "Turing complete" is a possible sign that the DSL sits at the wrong abstraction level, he explains:
the ability to extend existing languages (such as it is possible with MPS, Spoofax, and to some extent with Xtext2), makes it possible to build domain specific languages as extensions of general-purpose languages. So instead of generating a skeleton from the DSL and then embedding 3GL code into it, one could instead develop a language extension, that inherits for example expressions and/or statements from the general-purpose base language.
He still thinks that building a DSL is preferable to extending, say, UML. However, tools like MPS come with a built-in version of Java that can be extended, so if a DSL is intended to be integrated with Java, it makes sense to extend the general-purpose language. In particular, he recommends learning from 3GLs how type systems work and implementing similar rules in your DSL using the newer capabilities of language workbenches like MPS, Xtext or Spoofax.
When it comes to Viewpoints, Markus has also evolved his position quite a bit. The concept of viewpoints has been a strong belief of our industry, based on the credo that a software system cannot generally be described by a single notation for all relevant aspects. This is generally reinforced by the different roles involved at different times in the system construction process.
the advent of real language modularization with tools like Spoofax and MPS [make it possible to represent different viewpoints] as separate language modules. Specifically with MPS where the same model can be projected in different ways (and thereby perhaps showing different subsets/viewpoints) a completely new approach for handling concerns and viewpoints is possible.
Markus also emphasizes some of the advantages of using a DSL over a library, and suggests relying on language extension mechanisms:
So instead of providing users a way to define libraries, users (or a couple of developers helping them) may develop their own project specific language extensions in a modular way. This has the advantage of adaptive notations and static checks plus tool support.
The definition of what constitutes a "programming language" or "domain specific language" is blurring. Yet, and in spite of so many advances in the tools and in the approaches to Model Driven Development, few organizations are heading in that direction. Why would you / would you not adopt a Model Driven approach to software development and architecture?
The trouble with DSLs is that they are prone to the sort of abuse that all advanced language features are, namely that programmers who are either naive or too smart for the jobs they are doing will find something "interesting" to do with the feature. There is also something to be said for languages that are more "social", in that they scale well across project teams and organizations. Walking the line between expressivity, power and economy on the one hand and comprehensibility, consistency, intuitiveness and function within social groups (scaling well across talent levels) on the other is actually quite a difficult act. My tendency is to err very strongly towards languages that are dead simple. Allowing people to do meta-programming is something to be approached very carefully and probably restricted in some creative way.
Syntactic extensions have their downside too
I'd like to recommend Barbara Liskov's keynote, where she expresses her concerns about syntactic extensions and how they could impede comprehension.
Language Oriented Programming
Extending language syntax via text processing may be a viable approach for some situations but it is somewhat extreme. One can consider a host of other options before considering syntax extensions. For example:
1) Custom tags - both .NET and Java allow custom metadata to be attached to classes, methods, etc. This essentially allows one to extend the type system (rather than the language). A common use is for modeling XML Schema in code for object-to-XML serialization and validation (e.g. JAXB).
2) Abstract Data Types (ADT) with operator overloading
With functional languages, you can easily build mini languages with ADTs - without extending the language itself.
3) Code vs Expression Tree
Some languages allow code to be treated as data. LISP is famous for that. More recently, in F# (and to a certain extent in C#/LINQ) you can optionally obtain the expression tree of some piece of code instead of the default compilation output. You can associate your own 'compiler' with the expression tree to do something specific. F# specifically lets you wrap F# code in <@@ and @@> quotation markers to obtain its expression tree, which you can then traverse and interpret yourself.
4) Monads
Monads are a means of language extension that provides some features of aspect-oriented programming. For example, in F# monads are used to provide a 'language' for asynchronous programming that feels like normal synchronous programming.
5) Pattern Matching
Some languages (e.g. the ML family) have powerful pattern matching capabilities that can be exploited for language oriented programming. On top of this, F# has something called Active Patterns, which even incorporate backtracking. With Active Patterns your 'parsing' code can mimic the BNF notation of a target language. I think Prolog is another language where such an approach can be exploited well.
Re: Language Oriented Programming
Thank you for these comments, they are certainly worth thinking about. In the end, Information Technology and Information Systems are about managing state both in scale and in time. "Computing" is just an ancillary activity in an information system. It is unfortunate that all we have to manage state efficiently is computing-oriented paradigms, especially functional programming, which tends to negate state altogether.
So I don't quite follow the movement of our industry, which looks at parallelism as the next big thing in computing, but not in information technology.
Our industry has been given programming languages and paradigms that are a total misfit for decades now, and I am not quite sure why that is. I think it is time to evolve the "programming" paradigm towards information and away from computing.
Furthermore, with the exploding number of APIs available across the Web, one can also note that "execution" is being replaced by "action", and again, computing paradigms are all but useless in helping to manage this transition efficiently.
So sorry, but I think we need far better semantics than any modern general purpose programming language can offer.
Re: Language Oriented Programming
I am probably not as advanced in my thinking as you are, so I don't understand exactly what you mean, but I will attempt to answer regardless.
Many attempts in the past to work out certain higher level constructs have not really worked out. For example, UML based MDA and CASE tools have all but fizzled out. DSLs may be a better approach but the jury is still out. I believe that you need a language that is expressive enough to easily build higher level constructs (DSLs) when needed and work at the lower level when needed - because you need both.
Note that I actually did implement UML MDA by writing a transform from (a constrained) UML activity diagram to WS-BPEL (a workflow language). It ended up being a rather long Java program (2.5K LOC) and was not an experience I would like to repeat.
As for the web and distributed computing I can say two things:
a) Erlang's Actor model is useful here because it allows you to capture interactions as state machines. In F# you can easily do the same via F# agents and functional state machines (yes, you can do state with functions, you don't need the state pattern). I am sure you can easily achieve the same in Scala.
b) For the REST paradigm to actually work out (as Fielding intended), my opinion is that it needs to be combined with semantic web technologies (ontologies/OWL/RDF). This is the last great hope for REST (else you will have to settle for OData). But here again you need capable languages to deal with such a rich information set. SPARQL is a dedicated DSL for extracting information from an RDF graph. But you can also use Prolog (logic programming). In fact you can get embeddable Prolog-as-DSL implementations for, say, .NET, Java and LISP.
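To make point (a) a little more concrete, here is a minimal single-threaded sketch in Python of a functional state machine driven by a mailbox. It is not Erlang's actor model or F#'s agent API, only an approximation of the idea: each state is a function that receives a message and returns the next state, and the data travels in closures rather than in mutable fields. The order workflow and all names (`cart`, `awaiting_payment`, `shipped`, `Agent`) are hypothetical.

```python
from collections import deque


def cart(items=()):
    """State: collecting items. The current items live in the closure."""
    def receive(msg):
        kind, payload = msg
        if kind == "add":
            return cart(items + (payload,))
        if kind == "checkout" and items:
            return awaiting_payment(items)
        return receive  # ignore messages that make no sense in this state
    receive.name, receive.items = "cart", items  # exposed for inspection only
    return receive


def awaiting_payment(items):
    """State: order placed, waiting for payment or cancellation."""
    def receive(msg):
        kind, _ = msg
        if kind == "pay":
            return shipped(items)
        if kind == "cancel":
            return cart(items)
        return receive
    receive.name, receive.items = "awaiting_payment", items
    return receive


def shipped(items):
    """Terminal state: every further message is ignored."""
    def receive(msg):
        return receive
    receive.name, receive.items = "shipped", items
    return receive


class Agent:
    """A mailbox plus a current state function; messages are handled one at a time."""

    def __init__(self, initial):
        self.state = initial
        self.mailbox = deque()

    def post(self, msg):
        self.mailbox.append(msg)

    def run(self):
        while self.mailbox:
            self.state = self.state(self.mailbox.popleft())
        return self.state


agent = Agent(cart())
for message in [("add", "book"), ("checkout", None), ("pay", None)]:
    agent.post(message)
print(agent.run().name)  # prints shipped
```

A real actor runtime would add concurrency, supervision and distribution on top of this, but the state handling itself needs nothing more than functions returning functions.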
Scala and F# are two multi-paradigm languages (functional, OOP, actor, if not logic programming) that I believe can serve us well today. We really need to move off of Java now.
Re: Language Oriented Programming
>> example, UML based MDA and CASE tools have all but fizzled out
yes, for good reason
>> DSLs may be a better approach but the jury is still out.
I will probably have to look at Erlang and F# a bit more to answer properly.
My comment in general is that the industry is driven by "Computer Scientists" who spend their time "computing"; this is in complete contrast with the needs of the industry, which are at least 90% information-oriented.
In the end it looks to me that abstractions layered above computing paradigms where everything is an XXX (object, function, process, procedure, resource, event, service ...) have failed to provide a solution and have contributed to the general distrust in our industry's ability to deliver solutions rapidly with limited risk. Until I see one working I'll remain doubtful of that approach.
I appreciate the argument that MDA has not gone much further, but it may be time to look for new approaches.
Still not there yet
Without the help of effective DSL engineering tools, the initial effort required for developing even a simple DSL would be substantially higher and would simply outweigh the benefits of having a DSL over a library. Supporting the reuse of DSLs, through different composition techniques such as extension, weaving, etc., helps in preserving domain knowledge at a higher abstraction level in the form of languages, instead of libraries. Secondary tool support, for example for the comparison and merging of models, or for the co-evolution of meta-models and models, also has a positive influence on taking steps towards the development of DSLs.
On the other hand, DSLs still cannot be used as freely or conveniently as general-purpose programming languages (GPLs). Consider, for example, the convenience of advanced debugging facilities, the portability of Java, or the flexibility of dynamic languages. So the tools still need to be improved so that domain experts can use the modeling environments (both graphical and textual ones) the way programmers use standard programming environments today. In addition, the engineering of DSLs requires specialized knowledge that is often not available in every organization and, in general, is not yet well established in the state of software engineering practice. So there is still a lot to do before model-driven software engineering can become a widely accepted technology.