Study: Clojure, CoffeeScript and Haskell Are the Most Expressive General-purpose Languages

by Abel Avram on Mar 28, 2013. Estimated reading time: 3 minutes

According to a study, the most expressive general-purpose languages are Clojure, CoffeeScript and Haskell. The study uses LoC/commit as the measuring unit of expressiveness.

Donnie Berkholz, RedMonk's resident PhD, has conducted a study meant to quantify the expressiveness of various programming languages. The study is based on data provided by Ohloh, a repository that tracks over 500,000 open source projects written in about 100 languages and spanning around 20 years.

Berkholz used LoC/commit as the unit for measuring expressiveness, starting from the assumption that “commits are generally used to add a single conceptual piece”. He adds that the results are not a measure of maintainability or productivity, nor do they say how readable the resulting code is or how long it takes to write.

The following graphic shows the expressiveness of over 50 languages, colored by their popularity according to RedMonk’s language rankings published earlier this year: red for the most popular languages, blue for the second tier in popularity, and black for the third tier.


Each language's LoC/commit is distributed over a range, since the study covers many different projects per language, each with its own average. Languages are ranked by their median, shown as the black line inside the box, which is the LoC/commit value that 50% of the corresponding projects fall below. The bottom and the top of the box represent the 25th and 75th percentiles of the projects, while the whiskers go down to the 10th and up to the 90th percentile.
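As a rough illustration of how such box-plot statistics could be derived, here is a minimal Python sketch. It is not Berkholz's code: the function name `box_stats` and the sample numbers are hypothetical, and it assumes the input is a list of per-project LoC/commit averages for a single language.

```python
from statistics import quantiles


def box_stats(loc_per_commit):
    """Summarize per-project LoC/commit averages for one language.

    Returns the percentiles used by the box plot described above:
    whiskers at 10% and 90%, box edges at 25% and 75%, and the median.
    """
    # n=20 yields 19 cut points at 5%, 10%, ..., 95%.
    cuts = quantiles(loc_per_commit, n=20, method="inclusive")
    return {
        "p10": cuts[1],      # lower whisker
        "p25": cuts[4],      # bottom of the box
        "median": cuts[9],   # black line inside the box
        "p75": cuts[14],     # top of the box
        "p90": cuts[17],     # upper whisker
    }


if __name__ == "__main__":
    # Hypothetical per-project averages, invented for illustration only.
    sample = [35, 48, 52, 60, 75, 90, 120, 150, 210, 400]
    print(box_stats(sample))
```

Under this reading, a lower median means fewer lines per commit, which the study interprets as higher expressiveness.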

Some of Berkholz’ conclusions are:

Third-tier languages are heavily biased toward high expressiveness.

Functional languages tend to be highly expressive.

Domain-specific languages are biased toward high expressiveness.

Compilation does not imply lower expressiveness.

CoffeeScript (#6) appears dramatically more expressive than JavaScript (#51), in fact among the best of all languages.

Clojure (#7) is the most expressive of Lisp variants.

Although Go (#24) is getting increasingly hot, it’s not outstandingly expressive. … Despite that, it does trump all the tier-one languages, so someone who only had experience with them could certainly see an improvement when trying Go.

The conclusion that “Third-tier languages are heavily biased toward high expressiveness” makes one wonder why highly expressive languages do not become popular. Does their conciseness make it difficult for the average programmer to grasp and use them? Are there other reasons?

Berkholz also ranked languages by the consistency of their expressiveness, measured by the height of the box (the interquartile range), resulting in the next graphic:


Berkholz’ conclusions are:

Tier-one languages put in a much stronger showing here.  

Tier-one languages tend to be remarkably consistent, regardless of their expressiveness.

This suggests that a primary characteristic of a tier-one language is its predictability, even more so than its productivity.

Tier-three languages make a poorer showing here.

Java turns in the strongest performance of “enterprisey” languages (C, C++, Java).

CoffeeScript is #1 for consistency, with an IQR spread of only 23 LOC/commit compared to even #4 Clojure at 51 LOC/commit.
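A consistency ranking of this kind can be sketched as sorting languages by their interquartile range (IQR). The following Python snippet is only an illustration under stated assumptions: the helper names `iqr` and `rank_by_consistency`, the language labels, and all numbers are made up, and the study's own tooling may differ.

```python
from statistics import quantiles


def iqr(values):
    """Interquartile range: 75th percentile minus 25th percentile."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def rank_by_consistency(projects_by_language):
    """Order languages from most to least consistent (smallest IQR first).

    `projects_by_language` maps a language name to a list of per-project
    LoC/commit averages.
    """
    return sorted(projects_by_language,
                  key=lambda lang: iqr(projects_by_language[lang]))


if __name__ == "__main__":
    # Hypothetical data, invented for illustration only.
    data = {
        "LangA": [40, 45, 50, 55, 60],
        "LangB": [30, 80, 150, 300, 600],
    }
    for lang in rank_by_consistency(data):
        print(lang, iqr(data[lang]))
```

A small IQR says that projects in a language behave alike, regardless of whether their typical commit is large or small.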

Based on the expressiveness consistency results and RedMonk’s ranking of language popularity, Berkholz concludes that Clojure, CoffeeScript and Haskell are the most expressive general-purpose languages. His study is partially backed up by another study, conducted by David R. MacIver, who surveyed 2576 programmers using the Hammer Principle. According to MacIver, the most expressive languages are Haskell, Clojure and Scala, while the least expressive are C, PHP and ultimately TCL. MacIver’s study did not include CoffeeScript.

Berkholz’s post triggered a large number of comments on the original post, on Hacker News, and on Twitter. Many commenters argued that LoC/commit does not accurately represent the expressiveness of a language, that expressiveness should take code readability and maintainability into account, and that DSLs should not be included, among other objections.

Berkholz insists that his study is not about language readability and maintainability, but “rather something about the state of the code in the repository, the development practices in use, potentially the level of bugs you're likely to get (given the correlation between bugs and LOC)”, explaining in greater detail in a separate post why he used LoC/commit to measure expressiveness.

This is nonsense by Andrew Webb

It is a shame that this "study" was published in the first place, and even more of a shame that it is getting repeated elsewhere. The only conclusion it is actually possible to draw from the LoC/commit metric is how many lines of code there were per commit. There is no connection proven (or even assumed in most cases) between commit size and expressiveness. He offers no proof for the assumption that such a relationship exists.

This is not science, this is nonsense.

Re: This is nonsense by Abel Avram

The study starts from the assumption that “commits are generally used to add a single conceptual piece”. We may dispute that assumption, and in some cases it is certainly not true, but, interestingly enough, the study has results similar to the Hammer one, which was based on interviewing 2,576 developers. They may not be 100% accurate, but they are not far from it either.

Re: This is nonsense by Xie Fei


Absurd by Jim White

The authors really should have re-examined their metric when Scheme, LISP, and Clojure have "expressiveness" rankings clear across the spectrum. Whatever it is they're measuring (which probably has a lot to do with project age, codebase size, and programmer habits such as commit frequency) has very little to do with how much code it takes to implement some functionality.

Haskell is definitely expressive! by Roopesh Shenoy

I don't know about other languages or this study itself, but I've recently started learning Haskell, and as a C# developer, I can definitely see how much more expressive it is!

Re: Haskell is definitely expressive! by Faisal Waris

I agree. Haskell is one of the most expressive languages that I know of.

However, occasionally you can come across a line of Haskell code that is so dense that it takes tens of minutes to mentally parse it. This is probably not an issue for accomplished Haskell programmers, though.

The fact that expressiveness of statically typed Haskell ranks close to dynamically typed LISP and derivatives is doubly impressive.

Re: Haskell is definitely expressive! by Roopesh Shenoy

Yes - I love the fact that Haskell is statically typed and its type system is just freakin awesome! I'm still not fully used to thinking in those terms, but coming back to C# just feels like going back to the stone age.

Re: Haskell is definitely expressive! by Faisal Waris

As a pure functional language Haskell is excellent for learning functional thinking. However, Haskell's 'laziness' sometimes leads to unpredictable memory usage. And, let's face it, sometimes you do need mutability.

Scala and F# are good practical alternatives to Haskell, especially because each can integrate with large existing, respective-platform code bases. Both also offer laziness when needed.

A story that is making the rounds in F# circles chronicles that it took 7K lines of F# code to essentially do what took 180K lines in C#. See:

Re: This is nonsense by Jeff Dickey

I tried to reply to this, several times, and got hung up by the 1994-era Web form that insists verified links are invalid. So I wrote my reply as a blog post.

Re: Haskell is definitely expressive! by Roopesh Shenoy

awesome! Thanks for the link!

CoffeeScript really is incredible by Ivan L

I found that, with careful thinking, my code is about 50% of the comparable JavaScript and typically performs better.

Re: This is nonsense by Mark Pawelek

Abel should be laughing at himself for being conned by quackery.

Racket and Scheme are, essentially, the same language. To a lesser extent so are Lisp, Clojure, and Emacs Lisp. Why are they positioned so far apart?
