Yammer Moving from Scala to Java


An e-mail sent by Yammer employee Coda Hale to Scala's commercial management at Typesafe was leaked via YCombinator and a gist on GitHub. The e-mail confirms that Yammer is moving its basic infrastructure stack from Scala back to Java, owing to issues with complexity and performance.

Yammer PR representative Shelley Risk confirmed to InfoQ that the e-mail represented the personal opinions of Coda Hale rather than an official statement from Yammer itself; a follow-up from the original author has been published at http://codahale.com/the-rest-of-the-story/. In it, Coda clarified that the message was written in response to a request for feedback from Donald Fischer (CEO of Typesafe) following an earlier tweet indicating the move.

Update: Coda has published Yammer's official position on the subject, which confirms the above points. It also points out that every language has flaws (not just Scala) and that the e-mail was an attempt to offer advice on improving Scala's performance and addressing other concerns. Finally, it concludes that any high-performance project (for which Yammer uses Scala in production) has rough edges which need to be filed down, and that the e-mail was meant to help Scala improve.

Although the e-mail was not meant to be shared publicly, Coda put it on GitHub as a gist (since deleted) to get feedback from friends; the content was subsequently shared and then reported more widely.

Back in August 2010, Coda said on the Yammer Engineering blog that they were moving to Scala for their realtime future. The goal was to continue running on the JVM (for performance reasons), and the conversion had resulted in approximately a 50% code reduction:

Our initial prototype of Artie was in Java, but as a weekend experiment I tried reimplementing it in Scala 2.8. After a day, I had dropped about half the lines of code and added several tricky features. I was sold. It might be easier to hire Java developers, but a Scala team will be able to get a lot more done

Fast forward a year and a quarter, and the decision is being reversed:

Right now at Yammer we're moving our basic infrastructure stack over to Java, and keeping Scala support around in the form of façades and legacy libraries. It's not a hurried process and we're just starting out on it, but it's been a long time coming. The essence of it is that the friction and complexity that comes with using Scala instead of Java isn't offset by enough productivity benefit or reduction of maintenance burden for it to make sense as our default language. We'll still have Scala in production, probably in perpetuity, but going forward our main development target will be Java.

Stephen Colebourne, who recently posted the thread "Is Scala the new EJB2?", has annotated the mail with a number of bullet points summarising the issues involved:

  • Scala, as a language, has some profoundly interesting ideas in it. But it's also a very complex language.
  • In addition to the concepts and specific implementations that Scala introduces, there is also a cultural layer of what it means to write idiomatic Scala … at some point a best practice emerged: ignore the community entirely.
  • In hindsight, I definitely underestimated both the difficulty and importance of learning (and teaching) Scala. Because it's effectively impossible to hire people with prior Scala experience, this matters much more than it might otherwise.
  • Adding to the unease in development were issues with the build toolchain. … This emphasis on SBT being the one true way has meant the marginalization of Maven and Ant -- the two main build tools in the Java ecosystem.
  • Each major Scala release being incompatible with the previous one biases Scala developers towards newer libraries and promotes wheel-reinventing in the general ecosystem.
  • Via profiling and examining the bytecode we managed to get a 100x improvement by adopting some simple rules:
    • Don't ever use a for-loop
    • Don't ever use scala.collection.mutable
    • Don't ever use scala.collection.immutable
    • Always use private[this]
    • Avoid closures
  • I broached this issue [moving back to Java] with the team, demo'd the two codebases, and was actually surprised by the rather immediate consensus on switching. There's definitely aspects of Scala we'll miss, but it's not enough to keep us around.

Some of these issues are likely to be circumstantial (for example, hiring a developer with existing experience becomes easier the longer a language is popular), but others can be tested empirically. For example, one of the pieces of advice is to avoid for loops. This can be tested with the following piece of code:

scala>
  var start = System.currentTimeMillis();
  var total = 0;for(i <- 0 until 100000) { total += i };
  var end = System.currentTimeMillis();
  println(end-start);
  println(total);
114
scala>
  var start = System.currentTimeMillis();
  // Add before incrementing so the sum covers 0 to 99999, as in the for loop
  var total = 0;var i=0;while(i < 100000) { total += i;i=i+1 };
  var end = System.currentTimeMillis();
  println(end-start);
  println(total);
8

Using the for loop with an 'until' range here (which many Scala programmers would consider idiomatic) is significantly slower than the corresponding while loop (114ms against 8ms), even though the while loop is less readable. The corresponding Java implementation of the same loop takes 2ms for both the for and while loops.
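
One likely contributor to the gap is the way the loop is translated: a for loop over a range is rewritten by the compiler into a call to foreach that takes a closure, so each iteration goes through a function object rather than a plain counter. The following is a minimal sketch of the three equivalent forms, not the code used for the timings above; the object and method names are illustrative, and the totals are held in a Long (an assumption made here) so that the sum does not overflow.

```scala
object LoopForms {
  // Idiomatic for loop over a range.
  def sumWithFor(n: Int): Long = {
    var total = 0L
    for (i <- 0 until n) { total += i }
    total
  }

  // The form the for loop is rewritten into: foreach plus a closure.
  def sumWithForeach(n: Int): Long = {
    var total = 0L
    (0 until n).foreach(i => total += i)
    total
  }

  // Hand-written while loop: no Range object and no closure.
  def sumWithWhile(n: Int): Long = {
    var total = 0L
    var i = 0
    while (i < n) {
      total += i
      i += 1
    }
    total
  }
}
```

All three methods return the same value for the same n; only the generated code differs.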

Another test we can perform is to measure the performance of the mutable map by loading a data set consisting of Integer objects. (This can be compared in Java and Scala, and the cost of boxing should be equivalent.)

scala>
  val m = new scala.collection.mutable.HashMap[Int,Int]; 
  var i = 0;
  var start = System.currentTimeMillis();
  while(i<100000) { i=i+1;m.put(i,i);};
  var end = System.currentTimeMillis();
  println(end-start);
  println(m.size)
101
scala>
  val m = new java.util.HashMap[Int,Int]; 
  var i = 0;
  var start = System.currentTimeMillis();
  while(i<100000) { i=i+1;m.put(i,i);};
  var end = System.currentTimeMillis();
  println(end-start);
  println(m.size)
28
scala>
  val m = new java.util.concurrent.ConcurrentHashMap[Int,Int]; 
  var i = 0;
  var start = System.currentTimeMillis();
  while(i<100000) { i=i+1;m.put(i,i);};
  var end = System.currentTimeMillis();
  println(end-start);
  println(m.size)
55

When the same tests are written in plain Java, java.util.HashMap performs identically to the Scala run above, while java.util.concurrent.ConcurrentHashMap is twice as fast when driven from Java as when driven from Scala. Either way, both Java collection classes outperform scala.collection.mutable.HashMap (28ms and 55ms against 101ms). (Timings taken on OS X with JVM 1.6.0_29 and Scala 2.9.1, the latest at the time of writing.)
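
A single timed run in the REPL also includes JIT warm-up, so timings like those above are best read as indicative rather than precise. The following is a minimal sketch of a helper that repeats a measurement and keeps the fastest run; MapTiming, bestOfMillis, fillScalaMap and fillJavaMap are hypothetical names introduced for illustration and are not part of the original tests.

```scala
object MapTiming {
  // Evaluates `body` `runs` times and returns the fastest wall-clock
  // time in milliseconds; earlier runs double as JIT warm-up.
  def bestOfMillis[A](runs: Int)(body: => A): Long = {
    var best = Long.MaxValue
    var r = 0
    while (r < runs) {
      val start = System.nanoTime()
      body
      val elapsed = (System.nanoTime() - start) / 1000000L
      if (elapsed < best) best = elapsed
      r += 1
    }
    best
  }

  // Fills a Scala mutable HashMap with the keys 1 to n and returns its size.
  def fillScalaMap(n: Int): Int = {
    val m = new scala.collection.mutable.HashMap[Int, Int]
    var i = 0
    while (i < n) { i += 1; m.put(i, i) }
    m.size
  }

  // Fills a java.util.HashMap with the keys 1 to n and returns its size.
  def fillJavaMap(n: Int): Int = {
    val m = new java.util.HashMap[Int, Int]
    var i = 0
    while (i < n) { i += 1; m.put(i, i) }
    m.size
  }
}
```

For example, MapTiming.bestOfMillis(5) { MapTiming.fillScalaMap(100000) } returns the fastest of five runs; the absolute numbers will depend on the JVM and hardware.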

Unfortunately, the Scala collections are pervasive in the Scala library APIs, and as such, Java collection types may be converted to their Scala counterparts through implicits in the code. According to the migration mail, this resulted in significant rewriting for performance reasons.
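
As a rough illustration of how a Java collection ends up behind a Scala type, the sketch below wraps a java.util.HashMap in a Scala mutable Map. It assumes Scala 2.13 or later, where the explicit asScala converter lives in scala.jdk.CollectionConverters; in the 2.9 era discussed here the equivalents were scala.collection.JavaConverters (explicit) and scala.collection.JavaConversions (implicit). WrapperDemo and wrapped are hypothetical names.

```scala
import scala.jdk.CollectionConverters._

object WrapperDemo {
  // Returns a Java map together with a Scala view over the same data.
  def wrapped(): (java.util.HashMap[Int, Int], scala.collection.mutable.Map[Int, Int]) = {
    val javaMap = new java.util.HashMap[Int, Int]
    // asScala wraps the Java map rather than copying it, so every call
    // made through the Scala view is forwarded to the underlying Java map.
    val scalaView: scala.collection.mutable.Map[Int, Int] = javaMap.asScala
    (javaMap, scalaView)
  }
}
```

Code that receives the Scala view uses the Scala collection API from then on, even though the data still lives in the Java map.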

Performance of closures (lambdas) may be improved if the Scala compiler generates code with invokedynamic, something that might happen in future versions of the compiler. In addition, JDK 8 (which will bring native lambdas to Java, building on the method handles introduced in JDK 7) is expected to offer a number of performance advantages which a future Scala version may be able to exploit.

Finally, there is increasing pressure for Scala to fix its backward compatibility between releases (rather than just between minor releases such as 2.9.2 and 2.9.3). There has been no official announcement from Typesafe regarding the future roadmap for Scala, or when a stable compiled binary format will permit code to be backward (or forward) compatible between releases. Having a backward-compatible format would enable more stable libraries to be released and a community repository to be built, which would help anyone interested in building upon Scala in the future.
