Paradigm based Polyglot Programming
How many languages are you using on the same project? If you start counting, you will see that there are many: XML, Java, XSLT, HTML, CSS and so on. But the reason you are using almost all of them is that they happen to be mainstream and, oftentimes, they are the only language choice for a needed framework. You are almost obliged to use them. The choice is made for you. Style? CSS. Configuration? Often XML. Web interface description? HTML. However, if you want to adopt true polyglot programming, you will have to face the inevitable decision of language choice.
To make the right decision, one should keep in mind that the main reason for adopting polyglot programming is to be able to choose the right programming language for the domain problem at hand. But then the question is: how do you choose the right language for a given domain or sub-domain?
First of all, I guess it is crucial to understand the properties of the available languages (available in terms of production environment and all the other parameters that limit language choice in an enterprise project). Understanding a language's properties does not only mean understanding its pros and cons; more importantly, it means understanding how the language describes (models) the world. This is not an easy task because it requires a rather deep analysis of the nature of the language.
Understanding Available Programming Languages
A programming language can be viewed as a limited vocabulary and a set of rules that are composed to describe a particular problem. Our ability to describe the problem we are specifying depends very much on the constructs and concepts the language offers. This means that two quite important aspects of a programming language are: 1) its available vocabulary and 2) its rules of composition.
Paradigm of the Programming Language
A coherent vocabulary together with its rules of composition produces a paradigm that defines a great part of the language's nature (or the nature of a subset of a language, in a multi-paradigm programming language).
Paradigm is the most important property to consider about the available languages. It is essential to use the proper paradigm to produce concise, readable code. Using the right paradigm helps keep the correspondence between the problem domain and the software model, which makes it possible to create a clearer model and yields more readable code. Using the wrong paradigm, on the contrary, leads almost inevitably to ad-hoc code, hacks and an explosion of code (a large number of lines for a relatively straightforward task).
Examples of programming paradigms
Let's take XML, for instance. XML can be described as a set of elements (tags) where each element has properties (attributes). These elements can be nested so that an element can contain other elements, and properties can be set on each of them. This probably tells us enough about the compositional nature of XML. In short, it tells us enough about the language paradigm to be considered later in the domain analysis phase (see below). With this, we know that XML has a nested structure: its constructs consist of first-class elements with second-class properties.
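To make the nesting concrete, here is a minimal Python sketch that walks a small XML document with the standard library's xml.etree.ElementTree. The document, its element names and its attributes are invented for illustration; the point is only that elements nest while properties hang off each element.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: element and attribute names are illustrative only.
DOC = """
<order id="42">
  <customer name="Ada"/>
  <items>
    <item sku="A1" qty="2"/>
    <item sku="B7" qty="1"/>
  </items>
</order>
"""

def outline(element, depth=0):
    """Return one line per element, indented by its nesting depth."""
    attrs = " ".join(f'{key}="{value}"' for key, value in element.attrib.items())
    lines = [("  " * depth + f"{element.tag} {attrs}").rstrip()]
    for child in element:  # iterating an element yields its direct children
        lines.extend(outline(child, depth + 1))
    return lines

tree_lines = outline(ET.fromstring(DOC))
print("\n".join(tree_lines))
```

The recursion in outline mirrors the shape of the data: one call per element, one level of indentation per level of nesting.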
Another example is Lua. Its rather interesting structure is based on tables. Lua can be described as a language of first-class key/value tables that can contain other tables. So, obviously, it has a kind of tabular shape that we can, arguably, call a tabular paradigm. This too tells us enough about the paradigm, enough at least to help with a first, fast analysis.
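As a rough analogy only (this is Python, not Lua), nested dictionaries can give a feel for the "tables inside tables" shape. All keys and values below are made up for the sketch.

```python
# Rough Python analogy of Lua-style tables: every structure is a key/value
# table, and a value may itself be a table. Keys and values are made up.
config = {
    "window": {"width": 800, "height": 600},
    "plugins": {1: "spell-check", 2: "autosave"},  # array-like: integer keys
}

def lookup(table, *path):
    """Follow a sequence of keys through nested tables."""
    current = table
    for key in path:
        current = current[key]
    return current

width = lookup(config, "window", "width")
first_plugin = lookup(config, "plugins", 1)
print(width, first_plugin)
```

Records and arrays are both expressed with the same single construct here, which is the property the tabular paradigm relies on.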
We can look at all other languages in the same way. Lisp's major data structure is the list, and programming in Lisp can be conceived as list manipulation, which can in turn be viewed as the underlying paradigm of this language. Lisp is also a functional programming language, in which first-class functions can be composed to produce yet others, and this reflects the functional paradigm. Haskell is also a functional programming language, but it is characterized as well by a quite mathematical nature. This strongly typed language is known for its powerful types, which are an important abstraction tool and make Haskell quite a safe language with considerable abstraction power. Prolog is mostly composed of boolean predicates and assertions and can be described, as its name reveals, as a logic-based programming language.
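The idea of composing first-class functions to produce new ones can be sketched in a few lines of Python. The helper name compose and the two small functions are illustrative choices, not something taken from Lisp or Haskell themselves.

```python
from functools import reduce

def compose(*functions):
    """Compose functions right to left: compose(f, g)(x) == f(g(x))."""
    return reduce(lambda f, g: lambda x: f(g(x)), functions)

def increment(n):
    return n + 1

def double(n):
    return n * 2

# A new function built purely by combining existing ones.
double_then_increment = compose(increment, double)
result = double_then_increment(5)
print(result)
```

Nothing in double_then_increment is written by hand; it exists only as a combination of the other two, which is the essence of the functional style described above.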
Other programming language properties to consider
Having considered the paradigm, there are a few other properties to take into account when choosing your language for a particular sub-domain. Typing discipline (dynamic or static), syntax and others are less important than the language paradigm, yet they still have to be considered. Providing more insight into these properties is outside the scope of this article. One piece of advice is simply not to overrate their importance and to put more focus on the paradigm, which really plays an important role in how expressive the code for the sub-domain of interest can be.
Discovering and defining sub-domains of different nature
When considering polyglot programming, one has most probably already decided that several languages could be used for the domain of the project concerned. Usually, this decision comes from two things: an awareness that certain languages fit certain kinds of problems better, and an observation that the given domain is composed of different parts that may be of different natures.
Consequently, the problem domain can be partitioned into sub-parts (or sub-domains). This partitioning yields right away a good level of distinction and a rather explicit differentiation of sub-domains. These sub-domains, however, still need to interact, but the difference of their nature should not be ignored when implementing this interaction.
Once the domain is properly divided into sub-domains, the task of finding a correspondence between the domain and the programming language becomes easier. A domain problem is usually too large to be looked at from one angle. Partitioning gives the opportunity to use different tools for different sub-domains, thus supporting better abstraction and design by using the right tool for the problem.
Identifying sub-domains and their nature is not easy. It requires analysis skills and a thorough understanding of the target domain. It is essential to do this task in close collaboration with domain experts to avoid drawing false conclusions about the nature of sub-domains. The task is largely about observing the properties and interactions of different modules, and abstracting from them the style of interaction and the few concepts that we can describe as the paradigm or the nature of the sub-domain. There are several techniques and practices that help in learning the domain and discovering its nature, for instance Knowledge Crunching in Eric Evans' Domain-Driven Design and James Coplien's Multi-Paradigm Design.
While keeping the available programming paradigms in mind can help in formulating domain concepts in accordance with the available modeling constructs, it should not affect domain learning in a way that constrains the problem analysis and bends domain concepts to fit a preferred paradigm. This issue is not uncommon in OOP, where domain concepts are considered to be objects by default.
Choosing the right language/paradigm for a sub-domain
Having identified the sub-domains, it is now time to choose the right programming language for each of them. In a few words, the right programming language is the one that allows writing the sub-domain problem description in the most fluent way. While programming, one usually starts thinking under the influence of the language and through its concepts. Some programming languages make the code for the sub-domain obscure and full of noise, but with the right paradigm one can use the correspondence that exists between the sub-domain and the programming language concepts to produce more concise and extensible code. Extensibility is one of the important positive side effects of a paradigm match. A domain does not suddenly change its nature, even if it evolves. Hence, it will most probably evolve along the lines of the same paradigm. If the nature of the language matches this paradigm, one can assume that this reduces the cost of change induced by the domain's evolution and limits its scope.
Examples of sub-domain - programming paradigm correspondence
Graphical User Interface
Let's take the Graphical User Interface. As we look at an interface, what we see is panels inside other panels. These panels have graphical components inside them (buttons, photos, blocks) that can have other controls, which themselves can have other controls. From this brief analysis, we realize that a graphical user interface has a nesting nature. Things can be nested inside other things. Also, any of its components has properties. Given that, it seems like XML is a perfect fit for GUIs.
However, I guess our conclusion was drawn a bit too fast. A good part of a GUI can be represented by XML, but not necessarily everything. WPF with XAML is a good case for our study. XAML has an XML-based syntax which, as we said, goes along with the nesting nature of a GUI. However, binding elements, for instance, is done using a special syntax that is not based on XML. There are also other sub-domains of graphical interface description that could profit from a syntax more suitable than XML. I am thinking of styles, for example. Writing styles in XML adds a lot of noise because of the opening and closing tags, which carry no real meaning for this sub-domain. A better representation of styles should take their nature more into consideration. But what is the nature of graphical style description? A style in a GUI is merely a named set of properties with their values. Moreover, the style does not care where the property happens to be: in the component, in a sub-component or even deeper. A style is just a description of properties. If a component has a given property, it takes the description into consideration. Otherwise the description is simply ignored and does no harm. After this rather simple analysis, it becomes almost obvious why CSS offers a good choice of syntax for the sub-domain of graphical styles.
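The nature of a style described above (a flat set of property values, applied wherever a matching property exists and ignored elsewhere) can be sketched in Python. The Component class, the apply_style function and the property names are hypothetical and are not modeled on WPF or on any real CSS engine.

```python
from dataclasses import dataclass, field

@dataclass
class Component:
    """A hypothetical GUI component: named, with properties and children."""
    name: str
    properties: dict
    children: list = field(default_factory=list)

def apply_style(component, style):
    """Set each styled property on every component that already has it."""
    for key, value in style.items():
        if key in component.properties:  # unknown properties are ignored
            component.properties[key] = value
    for child in component.children:  # the style does not care about depth
        apply_style(child, style)

button = Component("save", {"color": "black", "font": "sans"})
panel = Component("toolbar", {"color": "grey"}, [button])

# A style is just a flat description of properties and values.
apply_style(panel, {"color": "navy", "font": "serif", "border": "none"})
print(panel.properties, button.properties)
```

Note that the style itself is a flat dictionary with no nesting at all, while the component tree is nested; this difference in shape is the reason a flat, property-oriented syntax suits styles better than XML does.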
Streams and data workflow processing
Imagine we've got a flow of actions, information or data that needs to be processed. Mostly, this stream of things is virtually endless, and it needs to be adapted and processed to produce another stream of actions or information based on the first one. Instances of this are RSS feeds, HTTP requests and responses, network signals and a lot of other stream-based domains. Here the example was already chosen to have a stream-oriented paradigm, but it is not too hard to figure out this nature with enough domain analysis. Choosing a technology that shares this nature is essential for reducing code clutter and improving the code's readability and quality. Otherwise you can find yourself writing a lot of nested loops and 'while' constructs, and that is where modularity starts to fade.
Functional programming in general, and lazy languages like Haskell in particular, have an interesting nature for dealing with infinite lists and streams. Functions can be sequenced to form an explicit workflow, which yields a more modular solution for stream-oriented domains.
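Haskell gets this behavior from lazy evaluation; as an approximation in Python, generators offer a similar way to sequence small stages over an endless stream. The stage names below (readings, only_even, squared) are invented for the sketch.

```python
from itertools import count, islice

def readings():
    """An endless hypothetical source: 0, 1, 2, ..."""
    return count()

def only_even(stream):
    """Filtering stage: lazily keep the even values."""
    return (value for value in stream if value % 2 == 0)

def squared(stream):
    """Transforming stage: lazily square each value."""
    return (value * value for value in stream)

# Stages are sequenced into an explicit workflow; nothing runs yet.
pipeline = squared(only_even(readings()))

# Values are only computed when a consumer asks for them.
first_five = list(islice(pipeline, 5))
print(first_five)
```

Each stage can be written, tested and replaced on its own, and no nested loops appear even though the source never ends.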
More examples of paradigm fit for a sub-domain
There are many other examples where the fit between the domain and the paradigm of the technical solution is essential when choosing tools. A domain with a lot of rules and validation predicates can take much advantage of predicate-oriented or logic languages like Prolog and Haskell. A concurrent programming language like Erlang can be the best fit for concurrency-based sub-domains, and domains that involve a lot of string processing can benefit from programming languages that have a powerful string processing paradigm.
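Python is not a logic language, but a small sketch can still hint at why a rule-heavy domain reads better when rules are plain predicates that combine into larger rules. The combinators all_of and any_of and the sample rules are hypothetical names chosen for illustration.

```python
def all_of(*rules):
    """Build a rule that holds when every given rule holds."""
    return lambda subject: all(rule(subject) for rule in rules)

def any_of(*rules):
    """Build a rule that holds when at least one given rule holds."""
    return lambda subject: any(rule(subject) for rule in rules)

# Hypothetical validation predicates over a simple dictionary record.
def is_adult(person):
    return person["age"] >= 18

def has_email(person):
    return "@" in person.get("email", "")

def has_phone(person):
    return bool(person.get("phone"))

# The business rule is stated almost as it would be spoken.
can_register = all_of(is_adult, any_of(has_email, has_phone))

accepted = can_register({"age": 30, "email": "ada@example.org"})
rejected = can_register({"age": 30})
print(accepted, rejected)
```

A language such as Prolog makes this declarative style the default rather than something to be built by hand, which is the paradigm match this section argues for.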
Multi-Paradigm Programming Languages
Some programming languages offer not a single paradigm but rather a selective mix of paradigms. While this represents a good opportunity to optimize the paradigm match, it does not categorically replace Polyglot Programming. Both approaches to multi-paradigm design have pros and cons. On the one hand, for instance, in mono-paradigm languages the syntax is optimized for the best expressiveness of the paradigm concerned. On the other hand, in multi-paradigm languages the integration of several paradigms is thought through while designing the language itself. The latter can be both an advantage, as it makes it easier to use several paradigms, and a disadvantage, as it makes this interaction implicit and less intention-revealing. Multi-paradigm programming languages tend to be complex, but some of them benefit from already being mainstream (like C# and C++).
These tradeoffs, and a lot of others, should be considered when choosing between the two techniques. When it is not really possible to use several languages because of the production environment (this is becoming less and less valid, with implementations of a lot of interesting languages on several mainstream platforms), a multi-paradigm language can be a good choice. Team members' skills can be a reason too, even if some people, including myself, would argue against its validity and see it as a problem in disguise.
Even if some generalizations, such as a tier-based choice of language, can be motivating for using polyglot programming, their approach is rather simplistic and does not directly and effectively address the issue for which one might need more than one programming language on a project. Thorough domain analysis and multi-paradigm design are central to Polyglot Programming. Matching the programming language paradigm to the sub-domain's nature is the key to more readable, concise and evolution-friendly code that is free from useless noise. Understanding and identifying the paradigms of programming languages is the first step towards optimized polyglot programming. Domain-Driven Design and Multi-Paradigm Design techniques should then accompany programming throughout the project.
About the Author
Sadek Drobi is a software engineer specialized in design and implementation of enterprise applications. Mostly interested in solutions for bridging the gap between business and developers (e.g. agile, DSL, domain driven design) he is currently working on a research proposal with a focus on language oriented programming and multiparadigm design. Sadek works as a consultant at Valtech Consulting. Passionate about his profession but also about photography, he publishes a technical blog at www.sadekdrobi.com and maintains a photo gallery http://photos.sadekdrobi.com
More on polyglot programming from Neal Ford...
Good stuff though...
Re: More on polyglot programming from Neal Ford...
Sure, Neal has already written a lot of interesting stuff about Polyglot Programming, as have Ola Bini and Martin Fowler. However, the focus of this article is more on considering the paradigm of the language as the most important property of a programming language when doing domain analysis. The article describes an approach inspired by the referenced work of James Coplien on Multi-Paradigm Design and Eric Evans' Domain-Driven Design.
Thank you for the reference Jared.
Re: Fascinating subject!
Add to this another dimension
I want to point out two issues that may be missing from the discussion.
My thesis was actually the construction of PLOG, a logic programming language à la Prolog, but Object Oriented (OOPL). Here you have a combination.
To create an interpreter for that, we used the C language plus Prolog to build a meta-interpreter (if you want to add to the mix).
Of course, there were two problems there: Find a way to blend PL and OO into the same language (Done by my professor) and a way to integrate both implementation languages into one product.
Some time ago I heard about the idea of including XML as a native type in Java, that is, the language would understand XML not as a product process but as a natural part of the language. Of course the result was not natural at all.
Having both paradigms in the same language is not an easy task, but if not included, you will need to work with round objects using a language for square ones!
That last issue is present every time you want to work in a polyglot environment; impedance mismatch may appear too. So the first thing to solve is how easy it is to integrate two paradigms.
Now, DSLs can be used to solve problems in the actual domain of the problem. But you have always two spaces: the problem and the solution space. The problem one is where the client lives, and you may want to have a language in that domain to solve the problem from that perspective. But the solution of the problem requires you to work with technology, which is another domain. Thus you may need to use languages to work with those technologies, and the problem domain language is not appropriate. That is the second problem.
All that adds another dimension to the discussion: you need to know where you are working (space) and how to integrate all the languages you use seamlessly.
What do you think?
William Martinez Pomares.
Re: Add to this another dimension
What I want to say is that seams do exist. Their existence is at the domain level, between sub-domains. We cannot, and more importantly should not, ignore them. Considering the right seams can have a good effect on the integrity, extensibility and flexibility of the software.
Thanks for the supporting example about multi paradigm languages from your own experience.
Re: Add to this another dimension
I agree totally with you. That is actually what I'm trying to state above: You are working with several domains, some from the problem space and the others from the solution space. The seam does exist, and it is not difficult to ignore. A bad seam may cause malfunction or impedance mismatch.
William Martinez Pomares
Interesting subject put in words
Dominique De Vito
I can give my own examples about multi-paradigm languages. Let's cite OCaml (from ML family) for example:
- with some stream processing features too
Quite a few paradigms.
More people know its cousin, F#, as F# is very strongly based on OCaml, as Microsoft itself has said.
One could see also :
- JoCaml and Acute for concurrency programming (partially sponsored by Microsoft Research) on top of OCaml.
- OCamlDuce is a merger between OCaml and CDuce, in order to introduce CDuce XML features into OCaml.
Similar languages in Microsoft world do exist : while F# puts functional features into C#, C-omega gives C# some XML features.
One key point here is type inference. It alleviates the burden for mixing paradigms into a single language.
It looks like Microsoft has understood the importance of type inference quite well, as its main base language, C#, has inherited some type inference features too.
Another example of multi-paradigm stuff could be achieved through translators, just like the Java-to-JS translator in GWT.
So, the battle is strong on both sides (polyglot programming versus multi-paradigm), and I don't know which one is going to win. IMHO, while I strongly prefer multi-paradigm languages, I try to avoid the bad effects of the "Highlander Fallacy" ("there can be 'only one'") - here is an example.
I also can't say whether having a common VM is a plus or not for polyglot programming. SUN and Microsoft, from my Java point of view, are promoting different sides:
- SUN seems to promote polyglot programming through JRuby, JPython and so on.
- Microsoft seems to promote multi-paradigm languages through more research and development around C# with F#, C-omega and other languages.
Time will tell. Anyway, it's interesting stuff, thanks for the post
InfoQ Sep 01, 2015