InfoQ

JRuby 1.1RC2 released with reduced memory requirements

Posted by Werner Schuster on Feb 21, 2008 09:00 AM

The second Release Candidate for JRuby 1.1 (RC2) has been released, and it's quite an improvement over RC1:
- 260 issues resolved since JRuby 1.1RC1
- Large IO refactoring
- Memory improvements for JIT'd methods:
   - Control total number of methods JIT'd
   - Support a JIT cache between runtimes to return permgen
   - Reduce codesize of generated methods (50-70% reduction)
Next to the Java port of the Oniguruma Regex engine, the most significant performance improvement of JRuby 1.1 over JRuby 1.0 is the introduction of the Just In Time (JIT) compiler, which compiles Ruby code to JVM bytecode. However, the JIT also exposes the problems that a language implementation on the JVM has to deal with.

One thing causing problems for JRuby's JIT is the way bytecode is managed in the JVM. The smallest loadable unit of bytecode in the JVM is a class, so when a Ruby method is JITed, the generated code is put into a method body in a new class, which is then loaded. However, this can lead to what is effectively a memory leak: bytecode is loaded into the PermGen (permanent generation), a memory region managed by the garbage collector, which by default is quite small, usually 64 MB. Nick Sieger explains how quickly this could be filled up just with JITed Ruby methods:
Consider a non-trivial Rails application that makes liberal usage of the Ruby standard library, and also uses a handful of plugins, and the number of methods available for JRuby to compile can easily exceed 10,000. If the average overhead of a single JRuby method class is around 8K (varying due to method size, of course), this would occupy up to 80 megabytes of permgen space. (By contrast, the JVM’s default size of the permgen space is 64 megabytes, so we’re already over the limit).
[..]
If you were to deploy 4 Rails applications each with 4 active runtimes into a single application server, you’re looking at almost 1.2 gigabytes of permgen space necessary to run your applications! (Usually, it’s common to run multiple applications in a Java application server, but with Rails applications that may need to be reconsidered.)
This is a very real problem: like the regular Java heap, the PermGen has a fixed maximum size, and once it is full, an OutOfMemoryError is thrown and the JVM eventually terminates.
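
The arithmetic behind these estimates can be reproduced with a few lines of Java. This is only a back-of-the-envelope sketch: the class and method names are made up for illustration, and the inputs (10,000 methods, roughly 8K per method class, 4 applications with 4 runtimes each) are the figures quoted above, not measurements.

```java
public class PermGenEstimate {

    /** Estimated permgen use of one runtime, in kilobytes. */
    static long perRuntimeKb(long compiledMethods, long kbPerMethodClass) {
        return compiledMethods * kbPerMethodClass;
    }

    /** Estimated permgen use when every runtime holds its own copy of each compiled method. */
    static long totalKb(int applications, int runtimesPerApplication, long perRuntimeKb) {
        return (long) applications * runtimesPerApplication * perRuntimeKb;
    }

    public static void main(String[] args) {
        long perRuntime = perRuntimeKb(10_000, 8);   // figures from the quote above
        long total = totalKb(4, 4, perRuntime);

        System.out.println("Per runtime:         " + perRuntime + " KB");
        System.out.println("4 apps x 4 runtimes: " + total + " KB");
    }
}
```

A single runtime comes out at 80,000 KB, which is already above the 64 MB default, and 16 runtimes come out at 1,280,000 KB, which is in the neighborhood of the 1.2 gigabytes mentioned in the quote (the exact figure depends on whether a megabyte is counted as 1,000 or 1,024 KB).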

Nick Sieger explains the various solutions to this problem in RC2:
Because of this multiplicative cost, shortly after JRuby 1.1RC1 was released we took the somewhat drastic measure of capping the number of methods that each runtime would JIT-compile to 2048. But after a while it became obvious even with a threshold-based approach, JRuby was still wasting a ton of permgen space with duplicate copies of compiled methods. So for 1.1RC2 we introduced a JIT cache that could be set up to be shared among multiple runtimes.
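
The two measures described in the quote, a per-runtime cap and a cache shared between runtimes, can be illustrated with a small Java sketch. This is not JRuby's actual implementation: all class and method names below are hypothetical, a compiled method is represented by a plain placeholder object instead of a loaded class, and only the cap of 2048 is taken from the quote.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class JitCacheSketch {

    /** Hypothetical placeholder for a compiled method (a loaded class in a real JIT). */
    static final class CompiledMethod {
        final String key;
        CompiledMethod(String key) { this.key = key; }
    }

    /** Cache shared by several runtimes, so an identical method is compiled only once. */
    static final class SharedJitCache {
        private final Map<String, CompiledMethod> compiled = new ConcurrentHashMap<>();
        private final AtomicInteger compilations = new AtomicInteger();

        CompiledMethod getOrCompile(String methodKey) {
            return compiled.computeIfAbsent(methodKey, key -> {
                compilations.incrementAndGet();   // stands in for real code generation
                return new CompiledMethod(key);
            });
        }

        int compilationCount() { return compilations.get(); }
    }

    /** Hypothetical runtime that stops JIT-compiling once it reaches its cap. */
    static final class RuntimeSketch {
        private final SharedJitCache cache;
        private final int maxJitMethods;
        private final Map<String, CompiledMethod> jitted = new HashMap<>();

        RuntimeSketch(SharedJitCache cache, int maxJitMethods) {
            this.cache = cache;
            this.maxJitMethods = maxJitMethods;
        }

        /** Returns the compiled method, or null if the cap is reached and the method stays interpreted. */
        CompiledMethod tryJit(String methodKey) {
            CompiledMethod existing = jitted.get(methodKey);
            if (existing != null) return existing;
            if (jitted.size() >= maxJitMethods) return null;
            CompiledMethod method = cache.getOrCompile(methodKey);
            jitted.put(methodKey, method);
            return method;
        }
    }

    public static void main(String[] args) {
        SharedJitCache cache = new SharedJitCache();
        RuntimeSketch first = new RuntimeSketch(cache, 2048);
        RuntimeSketch second = new RuntimeSketch(cache, 2048);

        CompiledMethod a = first.tryJit("String#upcase");
        CompiledMethod b = second.tryJit("String#upcase");

        System.out.println("Same compiled method shared: " + (a == b));
        System.out.println("Compilations performed: " + cache.compilationCount());
    }
}
```

In this sketch both runtimes end up holding the same compiled method and only one compilation takes place, which is the saving the shared cache is meant to provide. The cap alone only bounds the cost per runtime; it does nothing about duplicate copies across runtimes.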

A solution to this problem already exists on the .NET platform in the form of Dynamic Methods. Instead of compiling Ruby methods into Java classes with a single method body, the bytecode would be stored in a method object, with the emphasis on object. These Dynamic Methods behave just like regular objects, which are garbage collected once they are no longer reachable. This approach would also get rid of a lot of other overhead, as John Rose explains:
One pain point in dynamic language implementation is managing code dynamically. While implementor’s focus is on the body of a method, and the linkage of that body to some desired calling sequence, there is a host of surrounding details required by the JVM to properly place that code. These details include:
  • method name
  • enclosing class name
  • various access restrictions relative to other named entities
  • class loader and protection domain
  • linkage and initialization state
  • placement in class hierarchy (even if the class is never instantiated)

These details add noise to the implementor’s task, and often enough they cause various execution overheads. Because class of a given name (and class loader) must be defined exactly once, and must afterwards be recoverable only from its name (via Class.forName) the JVM must connect each newly-defined class to its defining class loader and to a data structure called the system dictionary, which will handle later linkage requests. These connections take time to make, especially since they must grab various system locks. They also make it much harder for the GC to collect unused code.
Of course, a feature like .NET's Dynamic Methods is not available on the JVM. Research is going on in the Da Vinci Machine project, with prototypes available, but it remains to be seen whether and when such a feature will make it into a Java release.
