
Building Domain Specific Languages on the CLR

Posted by Ayende Rahien and Oren Eini on Apr 21, 2008

Domain Specific Languages are a topic that has grown significantly in popularity recently. This can probably be traced to the Rails phenomenon. The popularity of Rails, and the extensive use of Domain Specific Languages (DSLs from now on) in Rails, has sparked widespread interest in DSLs.

Up until recently, developers had the impression that in order to build a DSL you need to have expertise in Compiler Theory, understand the inner workings of Lex and Yacc and in general be ready to invest a significant amount of time in building the DSL. As a result, very few people ever made an attempt, and when they did, they went the build-your-own-language-from-scratch route.

That tends to be costly.

At the same time, advocates of dynamic languages were able to utilize the dynamic nature of their favorite language in order to build Domain Specific Languages without much trouble at all. In fact, for many of them, this approach is the norm for developing any application of significant complexity.

The difference between the two approaches is significant. The first approach, building your own language, is called an external DSL. This is an expensive project to undertake, as you need to build everything from scratch, taking into account operator precedence rules, the runtime library, executing the code, error handling, and I/O. The second approach, using a host language and modifying it, is called an internal DSL. Those are much easier to build and maintain. You merely have to worry about your own modifications. All the other stuff (which you generally don't really care about) is already handled by the host language.

Another approach is to build a fluent interface and call it a DSL. I do not consider this a DSL, but some others do. This approach is often taken in languages where syntactic freedom is severely limited. Java and C# are good examples of such languages (and I include Java 6 and C# 3). You can make some inroads into more language-oriented APIs, but not nearly far enough for me to consider the result a DSL.

My preference, in almost all cases, is to go with an internal DSL based on a language with high syntactic flexibility. Since I tend to work on the CLR almost exclusively, I want to use a host language that runs on this platform. It allows me to reuse most of my hard earned knowledge of the CLR. Do not underestimate this benefit. Having a familiar environment at your hands is of tremendous importance.

Before delving into the language, what exactly is a "high syntactic flexibility language", anyway? What features does a language need in order to provide a good hosting environment for an internal DSL?

I need to have suitable means to express my intention. This can be accomplished by intent-revealing names, expressing domain specific concepts, and in general, moving away from the generic programming language approach. You want to be able to create a 4th generation language, and do it easily. Let us take a simple DSL that we use to script working with spreadsheets, shall we?

Your task, if you please, will be to create the multiplication grid.

for x in range(100):
    for y in range(100):
        cell[ x+1 , y+1 ] = x * y
    formula x, 100, sum( x1, x100 )

This is not really impressive, is it? It looks almost exactly like a programming language, and the code is trivial. Except that trying to do the same using the Excel automation API is going to be anything but simple and trivial.

Notice that the code that we have here is all the code there is. We don't need a class definition, or a main method. It is a DSL script that can be directly executed without any syntactic baggage.
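To make the "no syntactic baggage" point concrete, here is a minimal sketch in Python rather than Boo. It assumes a hypothetical host application that injects a `cell` object into the script's namespace before running it. The names `Grid` and `run_script` and the 3x3 grid size are illustrative assumptions, and the `formula` line from the script above is left out because it is not valid Python syntax.

```python
class Grid(dict):
    """Hypothetical stand-in for a spreadsheet: maps (column, row) to a value."""


def run_script(source):
    # The host supplies the domain vocabulary ('cell');
    # the script contains nothing but domain logic.
    cell = Grid()
    exec(source, {"cell": cell, "range": range})
    return cell


# The script itself has no class definition and no main method.
SCRIPT = """
for x in range(3):
    for y in range(3):
        cell[x + 1, y + 1] = x * y
"""

grid = run_script(SCRIPT)
```

The script text could just as well be loaded from a file; everything that is not domain logic lives in the host.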

If the previous example didn't impress you much, how about defining business rules for discounts on orders:

apply_discount_of 5.percent:
    when order.Total > 1000 and customer.IsPreferred
    when order.Total > 10000

suggest_registered_to_preferred:
    when order.Total > 100 and not customer.IsPreferred

This looks a lot less like a programming language; it looks like the way a business analyst would define the rules in a Word document.
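For comparison, the same rules can be approximated as plain data plus predicates in a general purpose language. The following is a rough Python sketch, not the Boo implementation. The `Order` and `Customer` types, the `rule` and `evaluate` helpers, and the reading that a rule fires when any one of its `when` clauses holds are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Order:
    total: float


@dataclass
class Customer:
    is_preferred: bool


RULES = []  # (rule name, conditions) in declaration order


def rule(name, *conditions):
    """Register a rule; assumed to fire when any one condition holds."""
    RULES.append((name, conditions))


rule("apply_discount_of_5_percent",
     lambda order, customer: order.total > 1000 and customer.is_preferred,
     lambda order, customer: order.total > 10000)

rule("suggest_registered_to_preferred",
     lambda order, customer: order.total > 100 and not customer.is_preferred)


def evaluate(order, customer):
    """Return the names of all rules whose conditions match."""
    return [name for name, conditions in RULES
            if any(check(order, customer) for check in conditions)]
```

Even in this small form, the lambdas and quoting are exactly the kind of syntactic noise that the DSL version above manages to remove.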

From my point of view, both examples are Domain Specific Languages. They simply have different styles and approaches for expressing their domain. In both, we have actually removed from the language everything that is not directly related to our domain. This allows us to stick with the domain, and hopefully have good tools to deal with it.

The removal of anything but the domain concepts is as important as having a syntax that matches the domain.

When we start to look at languages with high syntactic flexibility on the CLR, we have a wide variety of options; let us evaluate a few of them. We will start with those that come from Microsoft.

C# - This is a very rigid language: mandatory type definitions, no standalone methods or code, and an inflexible syntax. All those traits conspire to make C# a bad choice for a DSL host language. Oh, it can be done, but it wouldn't be as good as the other approaches.

VB.Net - VB.Net is actually far more appropriate for language oriented programming, because it uses many English words as keywords and operators. Unfortunately, it is also a very verbose language, and we want to reduce verbosity to just our domain concepts.

JScript - This may cause a laugh, but JScript is a very flexible language, which offers a fairly good syntax for many things. JScript offers the same facilities as JavaScript, after all, and one only has to look at libraries like jQuery or Prototype to understand how flexible you can make it. However, it is no longer developed, so I am not sure what kind of a future it has. And while it has flexible syntax for many things, it still has that programming language feeling to it, which I find distracting in a DSL.

F# - This is a functional language, developed by Microsoft, which is supposed to ship in the future. F# supports language oriented programming. I have skimmed over that language very briefly. Although the power of F# is impressive, from my point of view it looks more like BNF definitions than anything else. Unquestionably this stems from my own lack of experience with functional programming languages, but I just do not consider it readable.

We are done with the languages developed by Microsoft. Let us look a bit further afield. At last count the CLR boasted over a hundred languages, so I am going to pick just two, which I deem to be valuable candidates for DSL host languages.

Nemerle is a multi paradigm language (OO and Functional), with full support for compiler macros (of the Lisp variety, not the C++ ones; more on that later), and a host of other things that make it a good target for a host language for a DSL. Not the least for the simple reason that I can actually read Nemerle code (well, most of the time).

Boo is a statically typed OO language with a Python based syntax. It supports macros (again, of the Lisp variety), an open compiler pipeline and has specific features that were explicitly designed to make DSL building easier. Boo is my preferred language for building DSLs, but in order to preserve at least a semblance of objectivity, we need to review the next subject before I can show you how powerful Boo is.

What about the DLR?

So far, I have skipped talking about the Dynamic Language Runtime, which is a Microsoft project to get dynamic languages (such as Ruby, Python and EcmaScript, to mention a few) on the CLR.

More specifically, when people are thinking about the DLR, they are thinking about IronRuby and IronPython.

Ruby is a language that has proven to be well suited for writing an internal DSL, and running it on the CLR will ensure that we are working in a familiar environment.

Using the DLR as a platform for a DSL is certainly possible, but I would avoid it, at least for the time being. The DLR, and IronRuby itself, are still works in progress. I don't think that there is any commitment to a release date yet. Furthermore, I haven't found much that I could do in Ruby that I couldn't do in Boo, and I find Boo's meta programming facilities both very natural and extremely powerful.

What do I mean by "natural and extremely powerful"?

Let us examine Boo in a little more depth. I said that it has an open compiler. I didn't mean that it is Open Source (it is, but that is not relevant); I meant that you have a way to reach into the compiler and start messing around with the compiler's internal object model during compilation. This means that we can change the way the compiler behaves in an interesting fashion.

The two code samples above are both Boo DSL code.

Getting into the full details of Boo's meta programming facilities is out of scope, but I think that I can show off a simple sample that will demonstrate its power.

The CLR has the notion of IDisposable, and the using statement to go with it. Right now I am going to define an ITransactionable, and define a transaction statement that will go with it.

public interface ITransactionable:
    def Dispose():
        pass
    def Commit():
        pass
    def Rollback():
        pass

macro transaction:
    return [|
        tx as ITransactionable = $(transaction.Arguments[0])
        try:
            $(transaction.Body)
            tx.Commit()
        except:
            tx.Rollback()
            raise
        ensure:  # Boo spells 'finally' as 'ensure'
            tx.Dispose()
    |]

With just this code, we can start using the transaction statement, as a first class language element (in fact, this is exactly how the using statement is implemented in Boo).

transaction GetNewDatabaseTransaction():
    DoSomethingWithTheDatabase()

Now, if the code inside the transaction raises an exception, the transaction will be rolled back automatically. If it is successful, it will be committed automatically.
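The commit, rollback and dispose ordering described above can be sketched outside Boo as well. The following Python analogue uses a context manager instead of a compiler macro, so it illustrates the runtime behavior only, not the compile-time expansion. `FakeTransaction` and its lowercase method names are hypothetical stand-ins for an `ITransactionable` implementation.

```python
from contextlib import contextmanager


class FakeTransaction:
    """Hypothetical transaction that records the calls made on it."""

    def __init__(self):
        self.calls = []

    def commit(self):
        self.calls.append("commit")

    def rollback(self):
        self.calls.append("rollback")

    def dispose(self):
        self.calls.append("dispose")


@contextmanager
def transaction(tx):
    try:
        yield tx
        tx.commit()        # reached only if the body did not raise
    except BaseException:
        tx.rollback()
        raise              # rethrow after rolling back
    finally:
        tx.dispose()       # always runs


ok = FakeTransaction()
with transaction(ok):
    pass                   # body succeeds

failed = FakeTransaction()
try:
    with transaction(failed):
        raise ValueError("boom")
except ValueError:
    pass
```

After running this, `ok.calls` is `["commit", "dispose"]` and `failed.calls` is `["rollback", "dispose"]`.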

But that is just a demonstration of what you can do with the language. And note that the only new concept that I am introducing here is macro, and the funny [| |] symbols. Without getting too deep into it, this instructs the compiler to do a syntactic replacement of the code inside a transaction block with the content of the macro.

It is important to note that this goes beyond text substitution; we are modifying the AST (Abstract Syntax Tree - the compiler object model). This is a trivial (but powerful) example. We will explore a more complex scenario in a short while, which will show us why this distinction is important.

For building a DSL, even this level is not often needed. You can get by without using any of the meta programming options by just using the language syntax. Boo, akin to Ruby, has a lot of optional syntax, which is very useful in many scenarios. For example, we could create the same syntax without resorting to meta programming, but utilizing a feature of Boo's syntax, which allows us to pass a block of code to a method, if the last parameter is a delegate (closure, block, etc).

For instance:

def transaction(tx as ITransactionable, transactionalAction as ActionDelegate):
    try:
        transactionalAction()
        tx.Commit()
    except:
        tx.Rollback()
        raise
    ensure:  # Boo spells 'finally' as 'ensure'
        tx.Dispose()

And we can still use this code, just as we did before:

transaction GetNewDatabaseTransaction():
    DoSomethingWithTheDatabase()

From the point of view of the syntax, there is no difference. There is a subtle difference between the two versions, however. The CLR ensures that if the instruction preceding a 'try' block succeeds, then the 'try' block will be entered. This is critical for using() statements to work correctly, as well as for many other scenarios.

The first version could take advantage of this capability. But the second cannot. (The reason for that is that the second version involves a method call at runtime, while the first one will simply replace the transaction block with the modified results).

What else can we do with Boo's meta programming? Quite a lot; you could write a book about it (and in fact, I am writing a book about it :-) ). As a simple example, and not necessarily the best example of good design, you can modify the semantics of the 'if' statement in the language.

I had to do that once, when I changed 'if' statements from the following pattern:

if foo == null:
    # do something

To this pattern:

if foo == null or foo isa NullObject:
    # do something

Now, whenever we check for null, we also check if the object is an instance of NullObject, which is a custom type in my application. This allows me to use the NullObject pattern in a natural manner throughout the application. This means that the following code sample will print "Value is null":

val = NullObject() # set val to a new instance of NullObject
if val == null: # will be compiled as val == null or val isa NullObject
    print "Value is null"
else:
    print "Value is not null"

We have extended the language to consider all objects inheriting from NullObject as null.
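The same kind of AST rewrite can be approximated with Python's standard `ast` module. This is a sketch of the general technique, not of how the Boo compiler step is written. `NullCheckRewriter` and the `<dsl>` file name are hypothetical, and `None` plays the role of `null`.

```python
import ast


class NullObject:
    """Hypothetical application type that should compare as null."""


class NullCheckRewriter(ast.NodeTransformer):
    """Rewrite `x == None` into `x is None or isinstance(x, NullObject)`."""

    def visit_Compare(self, node):
        self.generic_visit(node)
        is_null_check = (
            len(node.ops) == 1
            and isinstance(node.ops[0], ast.Eq)
            and isinstance(node.comparators[0], ast.Constant)
            and node.comparators[0].value is None
        )
        if not is_null_check:
            return node
        is_none = ast.Compare(left=node.left, ops=[ast.Is()],
                              comparators=[ast.Constant(value=None)])
        is_null_object = ast.Call(
            func=ast.Name(id="isinstance", ctx=ast.Load()),
            args=[node.left, ast.Name(id="NullObject", ctx=ast.Load())],
            keywords=[])
        return ast.BoolOp(op=ast.Or(), values=[is_none, is_null_object])


SOURCE = """
val = NullObject()
if val == None:
    result = "Value is null"
else:
    result = "Value is not null"
"""

# Parse, rewrite the tree, then compile and run the modified program.
tree = ast.fix_missing_locations(NullCheckRewriter().visit(ast.parse(SOURCE)))
namespace = {"NullObject": NullObject}
exec(compile(tree, "<dsl>", "exec"), namespace)
```

Because the rewrite works on tree nodes rather than text, it cannot accidentally match a comment or a string literal. Note that this simple version evaluates the left-hand expression twice.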

The ability to go in and change such a fundamental part of the language made my work (and the usage of the language) much easier in the long run.

One last example, before we move on. I want to show you how you can add a (simplistic) 'design by contract' (class invariant) to a Boo application, in about 20 lines of code. Here it is:

import System
import Boo.Lang.Compiler
import Boo.Lang.Compiler.Ast

[AttributeUsage(AttributeTargets.Class)]
class EnsureAttribute(AbstractAstAttribute):

    expr as Expression

    def constructor(expr as Expression):
        self.expr = expr

    def Apply(target as Node):
        type as ClassDefinition = target
        for member in type.Members:
            method = member as Method
            continue if method is null
            # wrap the original body so the invariant is always checked on exit
            block = method.Body
            method.Body = [|
                block:
                    try:
                        $block
                    ensure:
                        assert $expr
            |].Block

And the usage:

[ensure(name is not null)]
class Customer:
    name as string

    def constructor(name as string):
        self.name = name

    def SetName(newName as string):
        name = newName

Now, any attempt to set the name to null will cause an assertion exception. This technique is quite powerful, and very easy to use. I'll leave writing the precondition attribute as an exercise for the reader.
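For readers without a Boo compiler at hand, the invariant idea can be sketched at runtime with a Python class decorator. This is an analogue under stated assumptions, not the Boo mechanism: the Boo attribute rewrites method bodies at compile time, while this hypothetical `ensure` decorator wraps the methods when the class is created, and it covers only `__init__` and public methods.

```python
import functools


def ensure(invariant, message="invariant violated"):
    """Class decorator: check `invariant(self)` after __init__ and every public method."""

    def wrap(method):
        @functools.wraps(method)
        def checked(self, *args, **kwargs):
            try:
                return method(self, *args, **kwargs)
            finally:
                # Mirrors the 'ensure' block: runs whether or not the method raised.
                assert invariant(self), message
        return checked

    def decorate(cls):
        for name, member in list(vars(cls).items()):
            if callable(member) and (name == "__init__" or not name.startswith("_")):
                setattr(cls, name, wrap(member))
        return cls

    return decorate


@ensure(lambda self: self.name is not None)
class Customer:
    def __init__(self, name):
        self.name = name

    def set_name(self, new_name):
        self.name = new_name
```

With this in place, `Customer("Ayende").set_name(None)` raises an `AssertionError`. Keep in mind that Python skips `assert` statements when run with the `-O` flag.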

This sample also demonstrates the power of working directly with the compiler object model (AST). We aren't limited, as in C++ macros, to text substitution. We can query the object model and modify it in a very natural manner. Well, by now I think that I ought to have convinced you that Boo is an awesome language, and that it is very suitable for building your DSL. I have literally just skimmed the very surface of its potential. There is a lot more to find out.

A few other advantages: Boo is a statically compiled language, which means that your DSL will have all the advantages of standard CLR code (JIT, GC, debugging, etc). From a performance perspective, there isn't any difference between your DSL code and your application code.

Thus Boo based DSLs are ideal candidates for code sections that both need to be changed often and require high performance. The common requirement of having to change things in production often pushes people towards XML based systems, rule engines, etc. Even without considering the whole "let us program in XML" debate, those choices suffer from poor performance.

Building a system that makes use of a set of DSL scripts is easy, offers high performance, and remains highly maintainable in the long run. It also meshes well with the idea of Domain Driven Design, since having a Domain Specific Language makes it easy to express the concepts of the domain naturally.

There are several publicly available Boo DSLs.

My personal favorite is Binsor. Binsor is a configuration DSL for the Castle Windsor IoC container, which makes working with advanced IoC concepts a breeze. You can learn more about Binsor by visiting the Binsor 2.0 announcement. Other interesting DSLs in Boo are:

Specter is a behavior driven development (BDD) framework that provides a very natural syntax for writing the specification, and translates the specification to a standard NUnit test.

Brail is a text templating language.

There are a few others, but they are targeting a much smaller niche, and are not widely known.

Writing a DSL requires some initial knowledge, but that knowledge is straightforward and quickly acquired. Once you have this base knowledge, you can get to the point where writing a DSL is about as hard as writing a form.

In fact, I have written a back-end processing system which was composed mostly of DSLs to process messages from various sources. In that system, I was writing DSLs about as often as I would have written forms in a presentation layer.

In conclusion, Boo is a really nice language to build a DSL in. Using Boo for writing a DSL tends to reduce the cost of writing a DSL significantly, without compromising on flexibility or performance. In addition it offers the syntactic freedom and ability to express your domain concepts in a "natural" way.

And as a parting shot, Boo runs on Java as well.

About the author

Oren Eini, better known as Ayende Rahien, is an experienced .NET developer and architect. He is a well known contributor to several open source projects, such as NHibernate and Castle. In addition, Ayende is the founder of Rhino Mocks, Rhino Commons, and NHibernate Query Analyzer. Regarding Boo, Ayende has created Brail, a templating language for Castle MonoRail, and Binsor, a DSL for configuring the Castle Windsor IoC container, and he is writing a book titled "Building Domain Specific Languages in Boo".

Good article - with one exception by Al Tenhundfeld

I really enjoyed the content of the article, especially the reference to Binsor. The last time I looked at Boo, I thought it was neat but had a long way to go. Maybe it's time I check it out again.



I do have one minor complaint about the article. It is full of typos and unnatural grammar. I wouldn't comment on it, but I see that you're writing a book, which sounds interesting. I hope that your book has better editing.



Examples:

"They simple have different styles and approaches for expressing their domain."

"Quite a lot, you could write a book about it (and in fact, I do write a book about it :-) )."

Re: Good article - with one exception by xy dan

Excellent article. I don't mind the unnatural grammar. This could be outsourced for better published articles though. Kevin.

Good article by Vijay Santhanam

I read your blog, and have followed your fascination with Boo + DSL.
The validation examples you show are really cool.

I'm still waiting impatiently for the book though to see a larger range of DSLs - from near natural language all the way to visual languages.

Awesome by shadi mari

Good as always Ayende

Minor improvement in transaction macro by Marcus Griep

This would be a minor improvement, but to avoid touching the exception stack, you can use Boo's support of the fault handler through the "failure" keyword, modifying the implementation of the transaction macro by substituting

failure:
    tx.Rollback()

for

except:
    tx.Rollback()
    raise


The fault handler is not supported in C# or VB.Net, but has the benefit of allowing you to perform actions when an exception goes unhandled, such as clean up, that would be handled differently if no exception were thrown.

There is yet another one DSL-related project by Geek Metaprogrammer

MBase compiler prototyping framework:
MBase technology preview.



As an example of small DSL using this framework:
Minim compiler

In my opinion this might be the most strategic article for .NET developers by Damon Wilder Carr



Damn, this is an amazing article.... I wish anyone else (i.e. the statistical average dev in .NET) could care less. This is our shame, this is our burden. My approach is not to cater to a lack of knowledge if it should be assumed. So I make a lot of enemies (grin)...





// cannot use generics (grin)...Screws up the post..
using (var ctx = Context.Resolve(typeof(IHonestyService))) {

//FYI all I can speak to is business. I'm sure pure science,
//research, etc. is not suffering like this but who knows).

}


I might be wrong, but think for a moment on how software engineering 'concepts' that leave their implementation to the 'reader' work for the average .NET developer.



As a .NET camper I know the answer, I assert, better than most, due to my early dive using .NET 1.0 in commercial global software dev as CTO with 90 devs. I was in the pre-camp(s) as well. I've fought the same wars here and now they have reached the point where those knowing better (myself included) had to form:



'The .NET Club of the 'how could you not know that!' : ALT.NET

Of course I belong!








We often must do the Winston Churchill as not even the people in 'charge' are an assumed baseline to act in their own best interests to not accept the horror I see. I get the entire 'we cannot handle the change on a human level' but how long will you say that? It's been... 10 years? Always?




"Make others do what you want but make them think it was their idea" is so relevant, as they will still argue things like 'design patterns are for abstract overpaid wannabe 'real coders'"




I heard that less than a week ago.... I kid you not...




that are abstract (design patterns?) have caught on by all but the best (i.e. base level in Java) I can assert in working all over the US with many, many .NET teams. It's never been in the culture, even in C++ WinAPI/ATL camps. I remember when I was there and so many wrote C in C++.



This is why ALT.NET was important but sad. And I am 100% in the .NET camp so I can say this.




Microsoft is all too aware of this as the DSL work has not moved forward (in spite of the amazing talent led by Jack Greenfield).



I would guess their approach was at that time (and it continues but I hope a converged and integrated combined DSL in the 'internal extended C#' will help.



WAIT....



It already has I assert (and one few area Oren and I disagree). I have been using C# 3.0 successfully for DSL Development because of Linq. Sure there are lots of language niceness but they can mostly be tied in their existence to Linq. So Microsoft already has a great and expanding internal language (and you could almost use any language with varying levels of discomfort and invention).



PLEASE CONNECT C# 4.0 in all the fundamentals ways needed to allow both the Graphical and internal to live, evolve, be tested, and everything else Oren blogs about and myself and my group are also entrenched in.



Everyone outside of the lower end .NET world (which is the world's largest community of developers is it not)???? Anyone have stats? seems to know that.... >


Again ALT.NET is written off as a radical group of 'pie in the sky' architects by the mass market, which conveniently allows the actual truth that they are mostly incompetent to be avoided.




It allows a belief that if there was a big tug-of-war metaphorically against the insanely smaller of us who cannot change our love of what we do, they would find their arms soon free of sockets and it harm us as much as them.



Nothing new about the few in software being orders of magnitude 'better'. Peopleware anyone? But I assert forces now have made that disparity so much larger for us due to external forces lacking to get the others motivated. Anyway.. Rant end.




So given this, not only is Oren (Ayende) working on perhaps the most important project for .NET right now (Linq to NHibernate), he's also driving forward the most strategic ideas in solving issues so far beyond just code, just language, just algorithms, he is by definition moving solutions forward on 'communication', 'team/domain expert', allowing true 'domain-driven' stories to happen, and moving us into the realm of the strategic big problems in business.









This article represents the best content anywhere period for those it targets.



Damon Wilder Carr
blog.domaindotnet.com

InfoQ.com and all content copyright © 2006-2013 C4Media Inc. InfoQ.com hosted at Contegix, the best ISP we've ever worked with.