XRuby: Another Approach to Ruby on the JVM
The JRuby project has made great strides towards support and compatibility for (most) of Ruby. At the same time, a lot of work was done in the performance area, causing improvements in JRuby's speed.
Yet, recently news of XRuby being faster than Ruby 1.8.5 in most benchmarks made the rounds. The results show some moderate speed increases, and some massive ones in areas of method invocation and array creation. And - according to the XRuby blog:
Maybe I should mention that the XRuby team had done virtually nothing for performance before, and we would avoid optimization as long as possible if it makes our code complicated.
XRuby is a compiler that takes in Ruby source files and turns them into Java bytecode (.class files that are bundled into a .jar file). This is a different approach to JRuby, which aims to work just as the Ruby interpreter, i.e. it's possible to run some code with
Xue Yong Zhi, of the XRuby project, starts of with an answer to the big pressing question that comes for every Ruby implementation: When will it run Rails?
As for Rails, we'd like to see it to run on XRuby at the end of year. It is hard to predict as all of our members are part-timers. Among all alternative ruby implementations, we are the one with least resource and probably the least ambitious. The project is pretty much a research and learning tool for us, and we have fun to do some pioneering work to compile ruby to java bytecode.
A big problem for every alternative Ruby VM implementation, is access to standard libraries. The JRuby team has spent a lot of time getting libraries such as SSL support (a massive effort tackled by JRuby team member Ola Bini). The issue is that many of these libraries are implemented in C, and are thus not useable in alternative Ruby implementations. Xue Yong Zhi again:
[...]built-in libraries are our biggest problem. All the other parts, like parser, compiler, runtime etc are in very good shape, but we lag behind in implementing built-in. And one of the reason may be it is not as challenging/cool as doing compiler:) - although we do understand how important it is.
If it is easy for us to reuse code form other projects, it is definitely a great thing. And personally I like rubinius's approach better as they are implementing most builtin libraries in pure ruby. This summer Google will sponsor two students to work on RSpec suite for Ruby implementations, I hope it will help rubinius and other projects.Rubinius is a project headed by Evan Phoenix, and aims to write a complete Ruby implementation in Ruby. The specific ideas for Rubinius come from Squeak, a Smalltalk that is implemented in Squeak. The advantage of this approach is clear: if every piece of the standard library is written in Ruby, the same code can be used on any Ruby implementation. Currently, systems such as JRuby, XRuby and other Ruby implementations such as Ruby.NET need to carefully figure out the behavior of Ruby and implement this in their implementation languages.
An example: Ruby uses a library called PCRE as it's implementation of regular expressions. Until recently, JRuby used
java.util.regex that is shipped with Java for Ruby regular expressions. However, it became clear that there are significant differences between Java
java.util.regex and PCRE, and so, a new Java library for handling Ruby regular expressions was started by Ola Bini. This is unfortunate; a better solution would be to implement the regular expression functionality in Ruby. With this, the exact same code code could be used across all implementations of the Ruby runtime, and there would be no need to carefully reverse engineer the behavior over and over.
Of course, there's a barrier to this: implementing the library in Ruby would mean lower performance, as current Ruby runtimes are rather slow. However: here is an idea making use of both JRuby and XRuby. If XRuby can continue turning Ruby code into high speed Java bytecode, it could be used as accelerator for performance critical code.
XRuby could potentially offer a solution in this case, at least when the JRuby is used as a runtime. The performance critical code could be compiled to Java bytecodes with XRuby, which hopefully results in much faster code. The resulting Java bytecode can be used with JRuby, since it's possible and very easy to access existing Java code from JRuby (after all, all the Java standard library is usable from JRuby). The big benefits of this
- the Ruby code remains the sole code base for the logic, no re-implementation necessary
- at deployment time, the faster performance of XRuby compiled code can be used
- at development time, the advantages of Ruby are retained (code changes at runtime, higher level coding with Ruby, etc)
This isn't just a problem for system libraries. Some time ago, David Heinemeier Hansson posted an article on rewriting some badly performing, yet performance critical Ruby code in C . It had turned out, that some Ruby code had become a bottleneck for an application. The solution was to rewrite the algorithm in C, and use Ruby's native extension system to use the resulting library. This is similar to the problem of having a regular expression library written in C.
Just as with the Ruby regular expression library, this is a messy solution. It means that the Ruby code must be dropped, and the C based reimplementation is developed from now on. Of course, a Ruby version could still be updated and kept in sync with the C code (or vice versa), but this means duplicated effort and the potential for subtle bugs between the implementations. Of course, the bigger problem is that the C code limits portability - suddenly a part of the code is not usable in JRuby or any of the other systems. The above described solution could help here too.
Whether this is actually a solution depends on how far the XRuby team gets and how big an edge it can keep over the JRuby compilation work. But there's a good chance for this. JRuby's Charles O. Nutter:
[...], the two runtimes are quite different, and I believe we've taken a different approach to compilation since JRuby will likely always have an interpreted mode. You can do things differently when you plan to compile all the time.
There remain problems to bring the two JVM based Ruby systems together. Both use their own implementations of the runtime, and the internals are quite different. The problems start at the base: XRuby represents a Ruby object as with the Java class
com.xruby.runtime.lang.RubyObject, JRuby as
org.jruby.RubyObject. This means that some JRuby code that wants to pass an object reference to some method compiled with XRuby will have a problem. And this is just the beginning. All this means, that - at least now - seamless interaction between JRuby and XRuby is hampered by many barriers.
However, XRuby already has potential to support JRuby in other areas. A very impressive feature of XRuby is their implementation of a Ruby parser using the ANTLR parser generator. Although JRuby - obviously - has a parser, there are problems. Ola Bini:
The big problem with it is the maintainability aspect. It's hard code, and especially the lexer is not easy to follow. We also have some undesireable complications to handle source positions correctly. From that perspective, a fully functional ANTLR parser would probably be preferable to ours, even if it had slightly worse performance. We're planning on investigate this closer, but not until after 1.0 has been released.Charles O. Nutter weighs in as well:
Ola mentioned we're not really concerned about performance, but we're always open to the possibility of moving to ANTLR at some point in the future. If it would make it easier for us to adapt to Ruby 2.0 changes and eliminate our dependence on an external tool (Jay, the parser generator), it would probably be worth it. I've also designed the compiler so that we could swap in a completely different parsed representation without too much effort. Maybe post 1.0 we'll look at that.Xue Yong Zhi, of the XRuby project, has some even more exciting news:
Our current one is ANTLR 2.7.x based, and it is fast enough. Once we move it to ANTLR 3.0, the speed is going to be much better. Google has approved a student(Haofei)'s SOC application to work on this, and other memeber (Femtowin) is willing to help as well. If everything goes well, we may even produce a ruby parser in ruby as there will be a ruby backend in ANTLR 3.0. The biggest advantage of a ANTLR based parser is the low maintenance cost.The Ruby parser in Ruby would be a very welcome addition to the Ruby tool set: until now, there is no Ruby parser written in Ruby. This is a big obstacle to Ruby language tools, because right now, there is no Ruby runtime independent way to get an Abstract Syntax Tree (AST, a data structure representing source code). In Ruby 1.8.5, a native extension is necessary; in a JRuby specific AST can be fetched, but this obviously needs JRuby-specific code. (Details for the Google Summer of Code (SoC) project mentioned).
In conclusion, the two projects are moving ahead at great speed. While XRuby still has to catch up in many aspects, it seems that it will be an important influence in the future, and definitely a project to watch.
PCRE and JRuby compiler
Two small points, though; first of all, MRI (Matz Ruby Implementation) does not use PCRE at all, but their own version of an older version of the Perl regular expression engine (development versions, 1.9.*-series, use Oniguruma, which isn't PCRE either). The points you make about PCRE are still valid, though, if you substitute "PCRE" for "The Ruby Regexp engine implementation".
Secondly, JRuby also have a compiler that turns Ruby source files into Java class-files or jar-files. It doesn't handle all nodes yet, but the same compiler can be used for both compiling whole files, and just-in-time-compiling hotspots in code, which makes the JRuby approach quite flexible. Further, the fact that we have a basically complete interpreter means that we can implement the compiler piecemal, and it will have a good effect for the cases it covers, but everything else will also be functional.
Re: PCRE and JRuby compiler
So far our compiler appears to outperform XRuby's, though we've done a lot more work in the performance department and our compiler is nowhere near as complete. In certain cases, we even outperform the new Ruby 1.9 bytecode implementation on its own tests. But the bottom line is that our official plan is for JRuby to support mixed-mode execution, so that all input code can eventually run as fast as Java bytecode.
I have also run some tests with our compiler comparing compiled Ruby code with hand-written JRuby Java code. It turns out that with the compiler in its current state, compiled Ruby is only about 20% slower than hand-written JRuby-runtime Java, so we may soon be able to rewrite portions of JRuby in Ruby or reuse the code from the Rubinius and MetaRuby projects. We are actively working toward that goal.
(Note that this doesn't mean our compiler makes Ruby run as fast as Java...just that it's no slower than our JRuby-internal implementations of Ruby features)
Congratulations to the XRuby guys on getting 0.1.4 released, and I hope they'll continue to have success in the future.
Re: PCRE and JRuby compiler
Damn, seems like I mixed up some information about Ruby and convinced myself that it uses PCRE.
Thanks for the clarification Ola!
And then there is ...