Insights: You don't need your DSL to be English-like

There is a widespread opinion that a good DSL has to be English-like in order to be readable for non-programmers. Dave Thomas argues against this approach, asserting that DSLs are not about getting as close as possible to natural language. Moreover, he argues that making this a guiding principle of DSL design can be detrimental. He also highlights what he believes is important in DSL design and gives some examples of successful DSLs that do not read like English.

According to Dave, DSLs don't need to be close to English or any other natural language because they target a very specific category of users, domain experts, who don't actually speak a natural language when they talk about their domain:

Domain experts […] are speaking jargon, a specialized language that they've invented as a shorthand for communicating effectively with their peers. Jargon may use English words, but these words have been warped into having very different meanings—meanings that you only learn through experience in the field.

Hence, a DSL should reflect this jargon and express the expertise of domain specialists concisely. Make for dependency management, Groovy builders for expressing data in code, and Active Record declarations for data modeling in Ruby are a few successful examples of DSLs that meet domain experts' needs without being English-like. Even though some Active Record declarations may look like English, e.g. has_many or belongs_to, they are not: “they are jargon from the world of modeling” and “they have a specific meaning in that context.”
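To make the point about jargon concrete, here is a minimal sketch of class-level declarations in the style of has_many and belongs_to. It is plain Ruby and does not use Rails; the Model base class and its associations list are assumptions invented for this illustration, not Active Record's implementation.

```ruby
# Hypothetical stand-in for a modeling DSL; this is NOT Active Record itself.
class Model
  # Each subclass keeps its own list of declared associations.
  def self.associations
    @associations ||= []
  end

  # Class-level declarations: jargon from data modeling, not English sentences.
  def self.has_many(name)
    associations << [:has_many, name]
  end

  def self.belongs_to(name)
    associations << [:belongs_to, name]
  end
end

class Author < Model
  has_many :books
end

class Book < Model
  belongs_to :author
end

Author.associations  # => [[:has_many, :books]]
Book.associations    # => [[:belongs_to, :author]]
```

The declarations read well to someone who knows modeling vocabulary, yet nothing in them depends on English grammar: each one is an ordinary method call with a symbol argument.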

Another important point raised by Dave is that, in his opinion, “domain experts” should be understood not as business users but as the people who write specs. These people are programmers, and they do not really need an English-like language. Dave believes that the notion of a fluent interface is often misunderstood: “the fluency here is programmer fluency, not English fluency. It's writing succinct, expressive code”.

Dave Thomas argues that trying to get closer to a natural language is not only unnecessary but can also be detrimental. Natural languages are imprecise. That imprecision is a source of their power in the real world, but it does not carry over to programming. This is why, “whenever we try to create a DSL that looks like a natural language, we fall short”. However hard one tries, the syntax tends to remain “very unEnglish like”. And this gap is rather confusing:

There's a major cognitive dissonance—I have to take ideas expressed in a natural language (the problem), then map them into an artificial language (the AppleScript programming model), but then write something that is a kind of faux natural language.

To illustrate the possible confusion, Dave gives an example of a piece of code from a test written with the test/spec framework and analyses one expression:

@result.should.be.a.kind_of String

It reads like English. But it isn't. The words are separated by periods, except the last two, where we have a space. As a programmer, I know why. But as a user, I worry about it. In the first example, we write @result.should.be.a.kind_of. Why not kind.of? If I want to test that floats are roughly equal, I'd have said @result.should.be.close value. Why not close.to value?

Trivial details, but it means that I can't just write tests using my knowledge of English—I have to look things up. And if I have to do that, why not just use a language/API that is closer to the domain of specifications and testing?
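The mix of periods and spaces becomes easier to see with a sketch of how such a chain could be wired up. This is an assumption-laden illustration, not the real test/spec implementation: the Should class and the Object#should hook are invented here.

```ruby
# Hypothetical sketch; not the real test/spec code.
class Should
  def initialize(actual)
    @actual = actual
  end

  # Connector words: they do nothing except keep the chain going.
  def be; self; end
  def a;  self; end

  # The only method that performs a check. It takes an argument,
  # which is why the chain switches from periods to a space here.
  def kind_of(klass)
    unless @actual.kind_of?(klass)
      raise "expected a kind of #{klass}, got #{@actual.class}"
    end
    true
  end
end

class Object
  # Entry point so that any object can start the chain.
  def should
    Should.new(self)
  end
end

"hello".should.be.a.kind_of String   # passes and returns true
```

In this sketch, kind.of would fail simply because no method named kind exists. Whether a given word is a connector, a check, or missing altogether is a decision made by the library author, so it cannot be derived from English.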

It is true that an English-like DSL may be more readable, but Dave argues that “the attempt to create a natural language feel in the DSL leads to all sorts of leaks in the abstraction”. It might add to the readability of code, but it would “be taking away from its writability” and “adding uncertainty and ambiguity”:

The second you find yourself writing

  def a
    self
  end

so that you can use "a" as a connector in

add.a.diary.entry.for("Lunch").for(August.10.at(3.pm))

you know you've crossed a line. This is no longer a DSL. It's broken English.
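A runnable approximation of that chain shows how little the connector words contribute. The Diary class and all of its methods are hypothetical, and the date part of the quoted chain is left out to keep the sketch short.

```ruby
# Hypothetical diary builder; every name here is invented for illustration.
class Diary
  attr_reader :entries

  def initialize
    @entries = []
  end

  # Connector words: each exists only to return self so the chain reads on.
  def add;   self; end
  def a;     self; end
  def diary; self; end
  def entry; self; end

  # The only method in the chain that does real work.
  # `for` is a Ruby keyword, but it is legal as a method name after a dot.
  def for(title)
    @entries << title
    self
  end

  # The same operation as an ordinary method call.
  def add_entry(title)
    @entries << title
    self
  end
end

diary = Diary.new
diary.add.a.diary.entry.for("Lunch")   # four no-op calls, one real one
diary.add_entry("Dinner")              # one call, nothing to guess

diary.entries  # => ["Lunch", "Dinner"]
```

Because the connectors all return self, this sketch accepts entry.a.add.for("Lunch") just as happily as the intended word order. The English-looking surface promises a grammar that the code does not enforce, which is one way to read the “uncertainty and ambiguity” Dave describes.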

One of the commenters, Has, also believes that by trying to make a language readable to non-programmers, one risks ending up with a “read-only language”. He takes AppleScript as an example. To improve its readability, it was necessary to remove “most of the usual symbolic cues that describe a language's semantics”. As a result, “the syntax effectively obfuscates, not clarifies, the language semantics”. While “it's very easy to read an AppleScript and understand _what_ it does, it's damnably hard to figure out exactly _how_ it does it”.

Has highlights another issue that may result from using an English-like DSL: users might assume that “because it _looks_ like English, it will also _behave_ like it” and “form all sorts of very strong associations and conclusions about its nature, which then have to be undone the long, hard way”. Hence, according to Has, an English-like appearance “accidentally encourages unrealistic user assumptions”.

If DSL readability and expressiveness are of interest to you, you can find more examples and comments on Dave's blog post.


Community comments

  • Another example...

    by Mirko Nasato,

    ...of a very popular DSL that certainly does not read like English: regular expressions!



    RE syntax is almost as cryptic as it can get for non-programmers, yet regular expressions are so valuable in their own "domain" (text parsing/searching/foo) that pretty much every major programming language now provides them, either built-in or as part of its standard library.



    Any attempt at writing a more verbose, "fluent" interface to regular expressions would never gain much attention, because conciseness is exactly why REs are so powerful.

  • Re: Another example...

    by James Richardson,

    Hamcrest has regular expression builders as a fluent, DSL-like thing.

    I know of a few production projects that use this package.

    I would certainly recommend using it in any Java project where the cost of maintenance and debugging looks like it might be higher than the cost of writing the code in the first place.

  • Re: Another example...

    by Mirko Nasato,

    I'd rather stick to the standard RE syntax (there are plenty of books, and many developers are familiar with it) and, of course, have a solid test suite for maintenance and debugging.



    Besides, I think that e.g. (for matching a URL)



    ^/item/(\w{12})/edit$


    is actually easier to read than something like (I'm making up the API)




    it.startsWith("/item")
    .followedBy(capturing(exactly(12, WordCharacters)))
    .followedBy("/edit")
    .andEnds();



    just like e.g.




    72 * 5 - 37 * 3



    is more readable (less noisy) than




    72.multipliedBy(5).minus(37.multipliedBy(3))



    and the advantages of a compact syntax become more apparent with more complex cases
    and/or when similar operations are repeated again and again. That's why such syntax was introduced in the first place.



    Verbosity != Readability

  • Re: Another example...

    by Francois Ward,

    I so completely agree with you, and it's something people miss a lot. However, an argument you could add is tool/ecosystem support and training costs. There are certain things developers in certain fields should know (depending on the field, of course), like SQL, regex, JavaScript, whatever. If you abstract that stuff away with an in-house solution, or a more obscure third-party solution, then when you hire someone you need to train them to use it, and they may not take too kindly to not being able to use their existing (and portable) knowledge.



    Then tool support: regexes can be copy-pasted into Expresso (under Windows anyway) and analysed in a very verbose manner, making them a snap to maintain, even complex ones. If you don't like Expresso, there are a ton of other tools, and there are tons of references and web sites to help. A custom fluent interface will only have the standard debugger of the given language and the documentation from the developer/vendor, and that's it. You're on your own for the rest. If it's not an internal solution, you may have the vendor/dev forum too, but that's limiting. With normal regex, you can post about it on virtually any developer forum, and someone can help.



    Same with all the other things I mentioned above (and it's true in other fields too, not just DSLs). I, for example, LOVE Object Relational Mappers (Hibernate, LLBLGEN Pro, LINQ to SQL, SubSonic, etc.), but when making an app, I have to be careful: some of these tools aren't as well known or supported as good old stored procedures, so it's a toss-up. Again, same deal with DSLs, especially custom internal ones.

  • Shouldn't need to be a complete language at all!

    by William Martinez,

    Language means it has a vocabulary and some grammar, but that doesn't mean we need to create a complete natural language at all. It could be used to execute something, or even to explain something to Stakeholders in their own words, and it could be made of just some jargon and a couple of rules.




    If you try to avoid too much thinking about what the vocabulary and grammar should be by reusing English rules, you may end up with something worse. There are examples in the micro-DSLs we create in strings (which are sometimes passed as parameters). Michael Stal posted about it in DSL Revisited, and I posted a response in Talking about DSL.





    Readability is actually important, since the idea behind a DSL is to improve the description of a domain in its own words.


    William Martinez Pomares.
    Architect's Thoughts

  • Re: Shouldn't need to be a complete language at all!

    by Sadek Drobi,

    I agree with you, William, about the importance of the expressiveness of a DSL. However, I am not sure that one can reuse English grammar in a programming language. A DSL is a programming language, and designing a programming language is not an easy task. That is actually why I am currently working on a project that puts more emphasis on the programming and coherence side of a DSL than on tricks and techniques to make it more readable.
    I have a wiki that is almost empty for now. I ain't got time to fill it :)

  • Re: Shouldn't need to be a complete language at all!

    by William Martinez,

    I agree too, Sadek. Just for the sake of clarity (I reread my sentences and they are very confusing), I think that many people try to reuse English grammar rules to avoid creating new custom rules. But that is a mistake, and it may lead you to chaos. On the other hand, I would say you can create a useful DSL for a specific task even if it is not fully sound or complete. That is why many developers may feel scared to try them.


    William Martinez Pomares.
    Architect's Thoughts

  • One need not go far at all...

    by Daniel Sobral,

    So, how many languages use plain English instead of mathematics' DSL? COBOL tried, and there weren't many followers, as far as I know. In fact, I only know of COBOL.

    But let's forget languages for now. Why do mathematicians and logicians use their own DSL instead of natural language? Because, not to put too fine a point on it, natural language SUCKS for what they are writing.

    And this is what we have to keep in mind. Natural language is actually very weak for abstract concepts. We usually don't notice it because we are naturally bound by the limits our own languages impose on us. Actually, the same happens for people who only know one programming language. :-)

    But it is our job to understand the abstract problems we are facing and to develop an expressive way of "talking" about them.
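The comparison Mirko Nasato draws in the comments between a regular expression and a verbose fluent API can be tried out directly. The sketch below is only an illustration: PatternBuilder and its method names are made up here (just as the commenter made up his API), and it is written in Ruby rather than Java.

```ruby
# The compact form from the comment, written as a Ruby regexp literal.
compact = %r{^/item/(\w{12})/edit$}

# A made-up fluent builder (hypothetical API) that assembles the same pattern.
class PatternBuilder
  def initialize
    @source = +""   # unfrozen string that accumulates the pattern source
  end

  def starts_with(text)
    @source << "^" << Regexp.escape(text)
    self
  end

  def followed_by(text)
    @source << Regexp.escape(text)
    self
  end

  # Appends a capture group of exactly `count` word characters.
  def captured_word_chars(count)
    @source << "(\\w{#{count}})"
    self
  end

  def and_ends
    Regexp.new(@source + "$")
  end
end

verbose = PatternBuilder.new
  .starts_with("/item/")
  .captured_word_chars(12)
  .followed_by("/edit")
  .and_ends

path = "/item/abcdef123456/edit"
compact_id = compact.match(path)[1]   # => "abcdef123456"
verbose_id = verbose.match(path)[1]   # => "abcdef123456"
```

Both forms capture the same identifier. The builder needs a class, four methods, and a reader who knows its vocabulary, while the one-line pattern relies on notation that regex users already share.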
