InfoQ

Insights: You don't need your DSL to be English-like

Posted by Sadek Drobi on Mar 28, 2008

Sections: Development, Architecture & Design
Topics: Design, Domain Specific Languages, Architecture
Tags: ActiveRecord, Business Natural Languages

There is a widespread opinion that a good DSL has to be English-like in order to be readable for non-programmers. Dave Thomas argues against this approach, asserting that DSLs are not about getting as close as possible to natural languages. Moreover, he argues that taking this as a guiding principle of DSL design can be rather detrimental. He also highlights what he believes is important in DSL design and provides some examples of successful DSLs that do not necessarily read like English.

According to Dave, DSLs don’t need to be close to English or any other natural language because they target a very specific category of users, domain experts, who actually don’t speak a natural language:

Domain experts […] are speaking jargon, a specialized language that they've invented as a shorthand for communicating effectively with their peers. Jargon may use English words, but these words have been warped into having very different meanings—meanings that you only learn through experience in the field.

Hence, a DSL should reflect this jargon and express the expertise of domain specialists in a concise way. Make for dependency management, Groovy builders for expressing data in code, and Active Record declarations for data modeling in Ruby are a few successful examples of DSLs that respond to domain experts' needs without necessarily being English-like. Even though some statements in Active Record declarations may look like English, e.g. has_many or belongs_to, they actually are not: “they are jargon from the world of modeling” and “they have a specific meaning in that context.”
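To make the idea of declarative modeling jargon concrete, here is a minimal sketch in plain Ruby. It is not Active Record: the Modeling module, the associations list and the Order class are hypothetical names introduced only for illustration, and the macros merely record what was declared.

```ruby
# Hypothetical mini-framework, NOT Active Record: it only mimics the
# declarative style of has_many / belongs_to.
module Modeling
  # Each "jargon" word is a class-level method that records an association.
  def has_many(name)
    associations << [:has_many, name]
  end

  def belongs_to(name)
    associations << [:belongs_to, name]
  end

  # Per-class list of declared associations, created on first use.
  def associations
    @associations ||= []
  end
end

class Order
  extend Modeling

  belongs_to :customer
  has_many :line_items
end

p Order.associations
# prints [[:belongs_to, :customer], [:has_many, :line_items]]
```

The declarations read as modeling vocabulary with a fixed meaning, not as English sentences, which is the point made above.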

Another important point raised by Dave is that, in his opinion, “domain experts” should not be understood as business users but rather as people who are writing specs. These people are programmers. They do not really need an English-like language. Dave actually believes that the notion of a fluent interface is often misunderstood: “the fluency here is programmer fluency, not English fluency. It's writing succinct, expressive code”.

Dave Thomas argues that trying to get closer to a natural language is not only unnecessary but can also be detrimental. Natural languages are imprecise. This is what gives them their power in the real world, but it cannot carry over to programming. This is why, “whenever we try to create a DSL that looks like a natural language, we fall short”. However hard one tries, the syntax tends to remain “very unEnglish like”. And this gap is rather confusing:

There's a major cognitive dissonance—I have to take ideas expressed in a natural language (the problem), then map them into an artificial language (the AppleScript programming model), but then write something that is a kind of faux natural language.

To illustrate the possible confusion, Dave gives an example of a piece of code from a test written using the test/spec framework and analyses one expression:

@result.should.be.a.kind_of String

It reads like English. But it isn't. The words are separated by periods, except the last two, where we have a space. As a programmer, I know why. But as a user, I worry about it. In the first example, we write @result.should.be.a.kind_of. Why not kind.of? If I want to test that floats are roughly equal, I'd have said @result.should.be.close value. Why not close.to value?

Trivial details, but it means that I can't just write tests using my knowledge of English—I have to look things up. And if I have to do that, why not just use a language/API that is closer to the domain of specifications and testing?
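The reason a programmer "knows why" the last two words are separated by a space can be shown with a small sketch. This is not the test/spec implementation: the Expectation class and the way should is attached to Object are assumptions made for illustration only.

```ruby
# Hypothetical sketch; this is not how test/spec is actually implemented.
class Expectation
  def initialize(actual)
    @actual = actual
  end

  # Connector words take no argument and return the same object,
  # so they can only be chained with periods.
  def be
    self
  end

  def a
    self
  end

  # The final word takes an argument, so Ruby lets us write a space
  # (a method call without parentheses) instead of another period.
  def kind_of(klass)
    raise "expected #{@actual.inspect} to be a kind of #{klass}" unless @actual.is_a?(klass)
    true
  end

  # Floats are compared within a tolerance (the default delta is an assumption).
  def close(expected, delta = 0.001)
    raise "expected #{@actual} to be close to #{expected}" unless (@actual - expected).abs <= delta
    true
  end
end

class Object
  # Entry point of the chain: wrap any object in an Expectation.
  def should
    Expectation.new(self)
  end
end

"hello".should.be.a.kind_of String
(0.1 + 0.2).should.be.close 0.3
```

Under these assumptions, kind.of and close.to could not work the same way: the argument has to be received by the last method in the chain, so the English phrasing has to bend to the method-call syntax.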

It is true that an English-like DSL may be more readable, but Dave argues that “the attempt to create a natural language feel in the DSL leads to all sorts of leaks in the abstraction”. It might add to the readability of the code but it would “be taking away from its writability” and “adding uncertainty and ambiguity”:

The second you find yourself writing

  def a
     self
  end

so that you can use "a" as a connector in

add.a.diary.entry.for("Lunch").for(August.10.at(3.pm))

you know you've crossed a line. This is no longer a DSL. It's broken English.

One of the commenters, Has, also believes that by trying to make a language readable to non-programmers one risks ending up with a “read-only language”. He takes the example of AppleScript. To improve its readability, it was necessary to remove “most of the usual symbolic cues that describe a language's semantics”. As a result, “the syntax effectively obfuscates, not clarifies, the language semantics”. While “it's very easy to read an AppleScript and understand _what_ it does, it's damnably hard to figure out exactly _how_ it does it”.

Has highlights another issue that may result from using an English-like DSL: users might assume that “because it _looks_ like English, it will also _behave_ like it” and “form all sorts of very strong associations and conclusions about its nature, which then have to be undone the long, hard way”. Hence, according to Has, an English-like appearance “accidentally encourages unrealistic user assumptions”.

If DSL readability and expressiveness are of interest to you, you can find more examples and comments on Dave’s blog post.

  1. Another example...

    by Mirko Nasato

    ...of a very popular DSL that certainly does not read like English: regular expressions!



    RE syntax is almost as cryptic as it can get for non-programmers, yet regular expressions are so valuable in their own "domain" (text parsing/searching/foo) that pretty much every major programming language now provides them, either built-in or as part of its standard library.



    Any attempt at writing a more verbose, "fluent" interface to regular expressions would never gain much attention, because conciseness is exactly why REs are so powerful.

  2. Re: Another example...

    by James Richardson

    hamcrest has regular expression builders as a fluent dsl like thing.

    i know of a few production projects that use this package.

    i would certainly recommend using it in any java project where the cost of maintenance & debugging looks like it might be higher than the costs of writing in the first place.

  3. Re: Another example...

    by Mirko Nasato

    I'd rather stick to the standard RE syntax (there are plenty of books, and many developers are familiar with it) and, of course, have a solid test suite for maintenance and debugging.



    Besides, I think that e.g. (for matching a URL)



    ^/item/(\w{12})/edit$


    is actually easier to read than something like (I'm making up the API)




    it.startsWith("/item")
    .followedBy(capturing(exactly(12, WordCharacters)))
    .followedBy("/edit")
    .andEnds();



    just like e.g.




    72 * 5 - 37 * 3



    is more readable (less noisy) than




    72.multipliedBy(5).minus(37.multipliedBy(3))



    and the advantages of a compact syntax become more apparent with more complex cases
    and/or when similar operations are repeated again and again. That's why such syntax was introduced in the first place.



    Verbosity != Readability
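As a quick check of the URL pattern quoted in the comment above, here is a minimal Ruby sketch. The constant ITEM_EDIT and the helper item_id are hypothetical names, and \A and \z are used in place of ^ and $ because in Ruby the latter match at line boundaries rather than at the ends of the string.

```ruby
# Pattern taken from the comment; only the anchors were adapted for Ruby.
ITEM_EDIT = %r{\A/item/(\w{12})/edit\z}

# Returns the 12-character item id, or nil when the path does not match.
def item_id(path)
  match = ITEM_EDIT.match(path)
  match && match[1]
end

p item_id("/item/abc123def456/edit")  # prints "abc123def456"
p item_id("/item/short/edit")         # prints nil
```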

  4. Re: Another example...

    by Francois Ward

    I so completely agree with you, and it's something people miss a lot. However, an argument you could add is tool/ecosystem support, and training costs. There are certain things developers in certain fields should know (depends on the field, of course), like SQL, Regex, Javascript, whatever (again, depends on the field). If you abstract that stuff with an in-house solution, or a more obscure third party solution, when you hire someone, you need to train them to use it, and they may not take too kindly to not being able to use their existing (and portable) knowledge.



    Then tool support: regexes can be copy-pasted into Expresso (under Windows anyway) and analysed in a very verbose manner, making them a snap to maintain, even complex ones. If you don't like Expresso, there's a ton of other tools, and there are tons of references and web sites to help. A custom fluent interface will only have the standard debugger of the given language, the doc of the developer/vendor, and that's it. You're on your own for the rest. If it's not an internal solution, you may have the vendor/dev forum too, but it's limiting. With normal regex, you can post about it on virtually any developer forum, and someone can help.



    Same with all the other things I mentioned above (and it's true in other fields too, not just DSLs). I, for example, LOVE Object Relational Mappers (Hibernate, LLBLGEN Pro, LINQ to SQL, SubSonic, etc.), but when making an app, I have to be careful: some of these tools aren't as well known or supported as good old stored procedures, so it's a toss-up. Again, same deal with DSLs, especially custom internal ones.

  5. Shouldn't need to be a complete language at all!

    by William Martinez

    Language means it has a vocabulary and some grammar, but that doesn't mean we need to create a complete natural language at all. It could be used to execute something, or even to explain something to Stakeholders in their own words, and it could be made of just some jargon and a couple of rules.




    If you try to avoid thinking too much about what the vocabulary and grammar should be by reusing English rules, you may end up with something worse. There are examples in the micro-DSLs we create in strings (which are sometimes passed as parameters). Michael Stal posted about it in DSL Revisited and I posted a response in Talking about DSL.





    Readability is actually important, since the idea behind a DSL is to improve the description of a domain in its own words.


    William Martinez Pomares.
    Architect's Thoughts

  6. Re: Shouldn't need to be a complete language at all!

    by Sadek Drobi

    I agree with you, William, about the importance of the expressiveness of a DSL. However, I am not sure that one can reuse English grammar in a programming language. A DSL is a programming language, and designing a programming language is not an easy task. Actually, that is why I am currently working on a project that puts more emphasis on the programming and coherence aspects of a DSL than on tricks and techniques to make it more readable.
    I have a wiki that is almost empty for now. I aint got time to fill it :)

  7. Re: Shouldn't need to be a complete language at all!

    by William Martinez

    I agree too, Sadek. Just for the sake of clarity (I reread my sentences and they are very confusing), I think that many people try to reuse English grammar rules to avoid creating new custom rules. But that is incorrect, and may lead to chaos. On the other hand, I would say you can create a useful DSL for a specific task that is not so sound or complete. That is why many developers may feel scared to try them.


    William Martinez Pomares.
    Architect's Thoughts

  8. One need not go far at all...

    by Daniel Sobral

    So, how many languages use plain English instead of mathematics' DSL? COBOL tried, and there weren't many followers, as far as I know. In fact, I only know of COBOL.

    But let's forget languages, for now. Why do mathematicians and logicians use their own DSL instead of natural language? Because, not to put too fine a point on it, natural language SUCKS for what they are writing.

    And this is what we have to keep in mind. Natural language is actually very weak for abstract concepts. We usually don't notice it because we are naturally bound by the limits our own languages impose on us. Actually, the same happens for people who only know one programming language. :-)

    But it is our job to understand the abstract problems we are facing, and to develop an expressive way of "talking" about them.
