
Q&A with Mark Stoodley, Architect of Eclipse OMR Toolkit for Creating Language Runtimes

IBM's Eclipse OMR is an open-source virtual machine toolkit for creating runtime environments for any language. The intent is to let language implementors reuse the hundreds of developer-years that IBM has invested in its Java runtime, both to improve existing languages such as Ruby, Python, and JavaScript, and to speed the creation of new languages.

This project recognizes the truly polyglot nature of the developer world. Other language runtimes can benefit by reusing some of the Java runtime's components. The belief is that although each of these languages might have its own niche, their runtimes are built from the same or similar components. As an example, a brand new language can use the generational garbage collector from Java rather than reinvent the wheel, even if its semantics and expressiveness are completely different from those of the Java language.

InfoQ caught up with Mark Stoodley, architect of the proposal.

InfoQ: IBM has invested hundreds of developer years and millions of dollars in the Java runtime. Why give this away? Is IBM not investing significantly in Java any more?

Mark Stoodley: On the contrary, IBM has invested and is continuing to invest significantly in the Java ecosystem. But we would prefer this large investment to be able to pay dividends for other language communities as well.
The rising prevalence of Cloud Computing platforms like IBM Bluemix and the trend towards microservice-based architectures mean we need to focus on a broader cross section of the language polyglot than we needed to even just a few years ago; our customers care about many different languages and so we have to care about many different languages. Leveraging investments efficiently across those language communities is important for IBM, but I don’t think we’re the only ones with this challenge, and that’s why we’re working to build the Eclipse OMR community.
Since virtually all modern languages are open source communities, the best way for us to get involved and make a difference is to be open ourselves. So really, Eclipse OMR is about IBM joining the existing open communities that build language runtimes rather than operating in isolation behind closed doors.

InfoQ: Is the Eclipse OMR project intended to benefit newer languages that have yet to be invented and those not on the developer map yet? Or could it help established languages like Ruby, Python, JavaScript and so on?

Mark Stoodley: Both, actually!
Eclipse OMR components integrate into a language runtime by implementing what we’re calling a “language glue layer”. Basically, that’s code that helps OMR components implement the semantics and behaviour of a particular language. We’re trying to keep that glue layer as simple as possible and as direct as possible for expressing a language’s needs for the Eclipse OMR components.
For example, a language runtime needs to teach the OMR GC how to iterate through the values in a memory region (i.e. an object) to find pointers to other memory regions (objects). That process is intricately tied to the object model used by the language runtime, so that gets implemented as part of the language glue for the OMR GC component. The pointers that are reported back are used by the OMR GC to keep those objects alive. Walking the full object graph efficiently on any kind of machine with any number of cores is something the OMR GC component knows how to do.
Getting a simple mark-sweep collector up and running is relatively easy with only a few things to implement. More elaborate collectors like generational GC technology that require write barriers are a bit more work. But we’re trying to find ways to make it easier and we welcome anyone who wants to help make it better! There’s a similar story for each of the components that makes up the Eclipse OMR project.
Filling in the language glue layer typically involves refactoring some code in existing runtimes, or just writing that code from scratch for new runtimes. But it’s code you’d probably have to write anyway if you’re writing your own runtime; you can just write less code and get better capabilities by leveraging the OMR componentry.
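
The division of labour described in this answer can be illustrated with a toy mark-sweep collector. This is only a minimal sketch and not OMR's actual API: the `Obj` class, the `scan_object` glue callback, and the `mark`, `sweep`, and `collect` functions are all hypothetical names invented for illustration. The collector knows nothing about the object layout; the only language-specific part is the callback that reports which other objects a given object points to.

```python
from typing import Callable, Iterable, List, Set


class Obj:
    """Hypothetical object model of an imaginary language runtime."""

    def __init__(self, name: str):
        self.name = name
        self.fields: List[object] = []  # may hold Obj references or plain values


def scan_object(obj: Obj) -> Iterable[Obj]:
    """Language glue (assumed shape): report the references stored in one object."""
    return [f for f in obj.fields if isinstance(f, Obj)]


def mark(roots: Iterable[Obj], scan: Callable[[Obj], Iterable[Obj]]) -> Set[int]:
    """Generic mark phase: walk the object graph using only the glue callback."""
    marked: Set[int] = set()
    worklist = list(roots)
    while worklist:
        obj = worklist.pop()
        if id(obj) in marked:
            continue  # already visited; this also makes cycles safe
        marked.add(id(obj))
        worklist.extend(scan(obj))
    return marked


def sweep(heap: List[Obj], marked: Set[int]) -> List[Obj]:
    """Generic sweep phase: keep only the objects that were marked."""
    return [o for o in heap if id(o) in marked]


def collect(heap: List[Obj], roots: Iterable[Obj],
            scan: Callable[[Obj], Iterable[Obj]]) -> List[Obj]:
    return sweep(heap, mark(roots, scan))


# Usage: a and b reference each other; c and d are unreachable from the root.
a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
a.fields = [b, 42]
b.fields = [a]
c.fields = [d]
live = collect([a, b, c, d], roots=[a], scan=scan_object)
print([o.name for o in live])  # ['a', 'b']
```

A different language would swap in its own `scan_object` and leave the traversal untouched, which is the kind of reuse the glue layer is meant to enable.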

InfoQ: Can you provide a technical overview of the Java runtime components and which of these components are ready for reuse today? Which of these components will be ready for reuse in the near future?

Mark Stoodley: IBM has already contributed about 200 KLOC in a number of different components that can be leveraged today, including:
  1. Garbage Collection (GC)
  2. Platform abstraction layer (port)
  3. Cross platform threading library (thread)
  4. Runtime structures: global and per-thread contexts useful when building runtimes

We just added generational support into the GC component, which brings the main workhorse of our Java GC technology out into the open for other languages to leverage. We’re working on diagnostic tooling to help runtime developers work with and debug language runtime issues. We’re working hard to get our Just In Time (JIT) compiler technology ready to be open sourced later this year. Writing a JIT compiler can be a daunting task, but we’re also working on simpler interfaces to the JIT so that people can get up and running relatively quickly without needing to understand all the finer details that compiler developers worry about on a daily basis.
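
The write barriers that generational collectors require, mentioned earlier in the interview, can be sketched in a few lines. This is a generic textbook illustration under stated assumptions, not the OMR implementation: `Node`, `ToyHeap`, `write_field`, and `minor_collect` are hypothetical names. The idea is that every pointer store goes through a barrier that remembers old objects pointing at young ones, so a young-generation collection can avoid scanning the whole old generation.

```python
class Node:
    """Hypothetical heap object; `old` is True once it lives in the old generation."""

    def __init__(self, name, old=False):
        self.name = name
        self.old = old
        self.fields = {}


class ToyHeap:
    def __init__(self):
        self.remembered = []  # old objects that may point at young objects

    def write_field(self, owner, key, value):
        """Pointer store with a write barrier that records old-to-young references."""
        owner.fields[key] = value
        if owner.old and isinstance(value, Node) and not value.old:
            if owner not in self.remembered:
                self.remembered.append(owner)

    def minor_collect(self, roots):
        """Return the surviving young objects, tracing from the roots and the
        remembered set only; old objects themselves are never traversed."""
        worklist = list(roots)
        for owner in self.remembered:
            worklist.extend(owner.fields.values())
        seen, survivors = set(), []
        while worklist:
            value = worklist.pop()
            if not isinstance(value, Node) or value.old or id(value) in seen:
                continue
            seen.add(id(value))
            survivors.append(value)
            worklist.extend(value.fields.values())
        return survivors


# Usage: an old object points at y1; y2 is unreachable young garbage.
heap = ToyHeap()
cache = Node("cache", old=True)
y1, y2 = Node("y1"), Node("y2")
heap.write_field(cache, "entry", y1)
print([n.name for n in heap.minor_collect(roots=[cache])])  # ['y1']
```

Without the barrier, `y1` would be missed, because the sketch deliberately never scans old objects during a minor collection. That extra bookkeeping on every pointer store is one reason generational collection takes more integration work than simple mark-sweep.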

InfoQ: Why Eclipse for open sourcing this project?

Mark Stoodley: It’s about making sure that everyone has access to our project and the freedom to use the technology whether you’re a student working on a class project, a hacker just trying to do something cool, a company wanting to work with and/or within an open source community, or even someone wanting to build a business with no immediate intentions to give back to the community (though we hope even these people will come around to see the value in contributing back into the community!).
The Eclipse Foundation provides a proven environment and the support infrastructure we need to make sure our intellectual property (IP) is protected and to ensure freedom of access. That environment makes us comfortable doing our development in the open, knowing that our IP is safe and we think others should also have that same level of comfort. Over time, we also expect to benefit from the community of Eclipse projects with real experience building platforms that others can use, extend, and contribute to, and that perfectly matches the aspirations of the Eclipse OMR project.
The Eclipse Foundation was a natural fit for us.

InfoQ: It seems to me that the Java community might stand to gain the least in the short term. Is your hope that the Java runtime might eventually benefit from a potential symbiosis?

Mark Stoodley: Absolutely. The Eclipse OMR project is about sharing runtime technology investment across lots of different languages, including Java, and building a community around the people who implement language runtimes. Having shared, reusable componentry is a great way for this community to share best practices wherever they come from. IBM is helping to seed this community by contributing some of the technology from its Java platform. The Eclipse OMR project is an open project with open contribution rules: anyone is free to contribute, and the project committers welcome all kinds of contributions. We aim to expand the number of committers to reflect the communities that are contributing to OMR componentry, whether by making improvements to existing components or by contributing or building completely new components that OMR doesn’t yet include.
We expect other languages to influence many of the OMR components, for example languages that are much more dynamic than Java, which we think will, in turn, help Java itself be better for dynamic languages.
So I absolutely think Java has something to gain here, just like other languages have something to gain: nobody has a monopoly on great ideas. But the entire development community stands to benefit when we share those great ideas via reusable runtime technology components.

InfoQ: Do you have evidence of this reuse working in practice? Evidence that this is not merely an academic exercise?

Mark Stoodley: We’ve done some exploratory work to prove to ourselves that the Eclipse OMR components can be used in more than one language: currently Java, Ruby, Python, and SOM++, a Smalltalk runtime that’s used to teach students how to build runtimes. We know it works: we’ve managed to migrate method profiling, garbage collection, and Just In Time compiler technology from Java to Ruby, Python, and SOM++, all of which we will be donating to those communities if they’re interested.
We haven’t done all the work, at this point, though. Our goal was to convince ourselves that the Eclipse OMR concept isn’t doomed from the start. And we overachieved on that goal: we managed to do some pretty cool things even starting with the requirement that we cannot break compatibility with the stock runtime. We brought integrated method profiling support to our Ruby and Python implementations using IBM Health Center tools, establishing that tool APIs can be introduced to OMR and then easily leveraged in multiple language runtimes to provide out-of-the-box monitoring capabilities. We used our Java GC technology to move all the off-heap data from Ruby onto the managed (garbage collected) heap, which enables much better control over memory footprint as well as improving performance because managed memory can be allocated and freed much faster than native memory. Finally, we implemented relatively simple Just In Time (JIT) compilers with just a few thousand lines of code in each runtime to improve performance by up to 2X even though we weren’t doing anything fancy. These JIT compilers can be taken way further than we’ve done to provide even better performance, but we think that work should really be done inside and alongside those communities and we’re looking forward to working more on these kinds of things as the Eclipse OMR project matures.
The proof point we have on compatibility is that our Ruby+OMR Technology Preview, which can be found on GitHub at https://github.com/rubyomr-preview/ruby, can run Rails applications even with the Ruby+OMR JIT enabled. We welcome feedback, so please feel free to drop by and try it out!
The other significant example is the IBM JDK itself. Our J9 development team started the work to distill the Eclipse OMR components from the IBM JDK a while ago and, along the way, we shipped the Java 8 version of the IBM JDK. We didn’t do this work on a side branch; we did the work while our larger development team continued to develop and release IBM JDK 8 with a pretty impressive list of features and virtually across-the-board performance improvements. And we’re continuing to build the next version of the IBM JDK while consuming changes from the Eclipse OMR project on an hourly basis. This is no academic exercise for us: it’s how we build our runtimes.
We also look forward to working with other language communities that are interested to see what the Eclipse OMR components can bring to their runtimes.

 

InfoQ: Which companies or communities are already working on project OMR? Can you talk a little bit about the roadmap?

Mark Stoodley: The Eclipse OMR project is primarily active with IBM developers at the moment, but that’s not because we don’t want others to work with us! The project is still really in its infancy, which unfortunately means it’s missing things like good documentation and a solid onboarding experience. Those are areas we really want to improve upon. In and around improving the core technology and getting all the stuff into the open that we want to get out, we’re going to be building out the documentation and reaching out to others. Anyone who is interested is welcome to drop by the project at https://github.com/eclipse/omr even if just to tell us what you’d like to be able to do.
We’re currently working on a set of pull requests for the Ruby community to consider accepting OMR components into the next major version of Ruby. Our current patches work off Ruby 2.2.5 and can be viewed at https://github.com/rubyomr-preview/ruby. We’ll be working to bring those patches forward to the master Ruby branch and also to restructure them into smaller, more manageable commits that enable specific capabilities via OMR components.
But that’s only what our existing community members are doing. We would welcome anyone who wants to get started with this technology, whether you work on language runtimes, or tools that interface with languages, or platform(s) that languages run on, or other frameworks that could benefit from tighter integration with different languages. It’s true that we’re at the chicken and egg stage right now, but with your and others’ help, we will be expanding and working with more and more language communities to start delivering on the promise of shareable, reusable componentry for building runtimes.
I look forward to seeing you at the Eclipse OMR community!

The release of the project was covered in an earlier InfoQ article.
