InfoQ

InfoQ

News

My Bookmarks

Login or Register to enable bookmarks for unlimited time.

The content has been bookmarked!

There was an error bookmarking this content! Please retry.

JRuby 1.1RC2 released with reduced memory requirements

Posted by Werner Schuster on Feb 21, 2008

Sections
Development
Topics
Dynamic Languages ,
Java ,
Ruby ,
JRuby
Tags
JRuby
The second Release Candidate for JRuby 1.1 (RC2) has been relased, and it's quite an improvement over RC1:
- 260 issues resolved since JRuby 1.1RC1
- Large IO refactoring
- Memory improvements for JIT'd methods:
   - Control total number of methods JIT'd
   - Support a JIT cache between runtimes to return permgen
   - Reduce codesize of generated methods (50-70% reduction)
Next to the Java port of the Oniguruma Regex engine, the most significant performance improvement of JRuby 1.1 over JRuby 1.0 is the introduction of the Just In Time (JIT) compiler, which compiles Ruby code to JVM bytecodes. However, it also shows the problems that a JVM language implementation has to deal with.

One thing causing problems for JRuby's JIT is the way bytecode is managed in the JVM. The smallest loadable unit of bytecode in the JVM is a class - so if a Ruby method is JITed, the generated code is put into a method body in a new class, which is then loaded. However, this is a potential source of problems and a memory leak: bytecode is loaded into the PermGen, a Garbage Collector generation, which by default is quite small, usually 64 MB. Nick Sieger explains how quickly this could be filled up just with JITed Ruby methods:
Consider a non-trivial Rails application that makes liberal usage of the Ruby standard library, and also uses a handful of plugins, and the number of methods available for JRuby to compile can easily exceed 10,000. If the average overhead of a single JRuby method class is around 8K (varying due to method size, of course), this would occupy up to 80 megabytes of permgen space. (By contrast, the JVM’s default size of the permgen space is 64 megabytes, so we’re already over the limit).
[..]
If you were to deploy 4 Rails applications each with 4 active runtimes into a single application server, you’re looking at almost 1.2 gigabytes of permgen space necessary to run your applications! (Usually, it’s common to run multiple applications in a Java application server, but with Rails applications that may need to be reconsidered.)
This is a very real problem - the PermGen behaves just like the regular Java heap: it has a fixed size, and once the PermGen is full, an OutOfMemory exception is thrown and eventually the JVM is terminated.

Nick Sieger explains the various solutions to this problem in RC2:
Because of this multiplicative cost, shortly after JRuby 1.1RC1 was released we took the somewhat drastic measure of capping the number of methods that each runtime would JIT-compile to 2048. But after a while it became obvious even with a threshold-based approach, JRuby was still wasting a ton of permgen space with duplicate copies of compiled methods. So for 1.1RC2 we introduced a JIT cache that could be set up to be shared among multiple runtimes.

The solution for this problem is already available as Dynamic Methods on the .NET platform. Instead of compiling Ruby methods into Java classes with a single method body, the bytecode would be stored in a method object - with the emphasis on object. These Dynamic Methods behave just like regular objects, which will be Garbage collected once they're not reachable anymore. This approach would also get rid of a lot of other overhead, as John Rose explains:
One pain point in dynamic language implementation is managing code dynamically. While implementor’s focus is on the body of a method, and the linkage of that body to some desired calling sequence, there is a host of surrounding details required by the JVM to properly place that code. These details include:
  • method name
  • enclosing class name
  • various access restrictions relative to other named entities
  • class loader and protection domain
  • linkage and initialization state
  • placement in class hierarchy (even if the class is never instantiated)

These details add noise to the implementor’s task, and often enough they cause various execution overheads. Because class of a given name (and class loader) must be defined exactly once, and must afterwards be recoverable only from its name (via Class.forName) the JVM must connect each newly-defined class to its defining class loader and to a data structure called the system dictionary, which will handle later linkage requests. These connections take time to make, especially since they must grab various system locks. They also make it much harder for the GC to collect unused code.
Of course, a feature like .NET's Dynamic Methods is not available on the JVM. Research is going on in the Da Vinci Machine project, with prototypes available, but it remains to be seen when a feature like that will make it into the next Java release.

No comments

Watch Thread Reply

Educational Content

New-age Transactional Systems - Not Your Grandpa's OLTP

John Hugg discusses high volume transaction processing applications with high and low frequency profiles, and how VoltDB can be used for that purpose.

Cool Code

Kevlin Henney examines code samples to see what can be learned from them starting from the premise that one won’t write great code unless he knows how to read it.

Collaboration: At the Extremities of Extreme

Jason Ayers share the observations he made watching a team of developers collaborating in real time on the same code base, pushing XP, pair programming and continuous integration to their extremes.

Yesod Web Framework

Michael Snoyman presents Yesod, a web framework written in Haskell and containing a web server, templating, ORM, libraries (templating, gravatar, etc.).

Transactions without Transactions

Richard Kreuter and Kyle Banker on how to avoid classical RDBMS transactional systems by using compensation mechanisms, transactional messaging or transactional procedures.

Attila Szegedi on JVM and GC Performance Tuning at Twitter

Attila Szegedi talks about performance tuning Java and Scala programs at Twitter: how to approach GC problems, the importance of asynchronous I/O, when to use MySQL/Cassandra/Redis, and much more.

10 tips on how to prevent business value risk

One category of risk that project teams need to ensure they address is business value failure – delivering a product that fails to provide value for the business investor.

Interview: Software Systems Architecture: Working With Stakeholders Using Viewpoints and Perspectives

InfoQ spoke to the authors of Software Systems Architecture on a couple of new topics, the System Context viewpoint and Agile, which have been added to the second edition.