Rubinius 1.1 - and the Future of the GIL
Rubinius 1.1 is out (downloads at the Rubinius website or the GitHub repo, or just use RVM).
The Rubinius 1.1 release notes show a long list of improvements and bug fixes, but also some useful additions.
Rubinius has long had a powerful, fast debugger; 1.1 adds some new debugging features (from the release notes):
- Add Debugger APIs and reference CLI debugger
- Add heapdump capability for memory debugging
- Add code to detect bad extensions. Recompile your extensions.
- Add 'rbx report' and support for VM and ruby crashes
The debugger can be enabled with a command line switch (-Xdebug), which will throw you straight into the debugger's command line at startup. Another way to use the debugger is via the Debugger API: after require 'debugger', calling Debugger.start in the code will enter the debugger.
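A minimal sketch of the API route, assuming the require 'debugger' and Debugger.start calls described above. The helper name start_debugger_if_rubinius is hypothetical, and the RUBY_ENGINE check is only there so the same file still runs on other Ruby VMs:

```ruby
# Sketch only: enter the Rubinius debugger when running on Rubinius.
# start_debugger_if_rubinius is a hypothetical helper, not part of Rubinius.
def start_debugger_if_rubinius
  # RUBY_ENGINE is "rbx" on Rubinius; other VMs skip the debugger entirely.
  return false unless defined?(RUBY_ENGINE) && RUBY_ENGINE == "rbx"

  require 'debugger' # debugger library named in the article
  Debugger.start     # per the article, this enters the debugger
  true
end
```

Calling start_debugger_if_rubinius at the point of interest would then behave like a breakpoint on Rubinius and do nothing elsewhere.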
The team is busy building documentation for Rubinius, which is now also available via rbx docs, a command that serves the docs in the browser.
Performance has also received some attention:
- Add automatic object ivar packing (improves memory usage)
- Enable block inlining within the JIT
- Implement a new GIL algorithm to prevent starvation
The new GIL algorithm uses the same ideas as the reworked GIL in Python 3.2. However, Rubinius 1.1 still has a GIL.
InfoQ caught up with Evan Phoenix, creator of Rubinius, before the 1.1 release, to ask about the Rubinius Hydra branch, which is supposed to yield a GIL-free VM.
InfoQ: What's your plan for removing the GIL in Rubinius?
[H]ere is the tactic I've taken:
1) Remove the GIL itself. This was fairly simple, it's just a C++ class called to in a few places.
2) Rubinius was already organized around thread-local and shared data structures. I added mutexes to the shared data structures and added locks to most methods. I'm probably overlocking some things, but that's OK for now. I simplified this task using some C++ techniques such as scope locks that lock and unlock automatically when used in a scope.
3) When I introduced the JIT, I had to write code to allow the garbage collector to stop all threads in the system so it could safely GC. This was because the JIT already runs concurrently with executing Ruby code. This code is cleanly reused now that the GIL is [removed].
4) I've been using the RubySpec thread specs to detect and fix thread related crashes. This has worked quite well since the specs exercise a number of behaviors of Threads. Most of the fixes have revolved around making sure Threads are properly started and cleaned up. I've used a combination of mutexes and spinlocks to keep things synchronized.
5) I've now started running the entire RubySpec suite to begin looking for more hangs and crashes.
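The scope locks mentioned in step 2 are a C++ technique, but Ruby's Mutex#synchronize follows the same pattern and can illustrate it. The following is only a sketch of the idea, not Rubinius code; SharedTable and its methods are hypothetical names:

```ruby
# Hypothetical shared structure in which every public method takes the lock.
class SharedTable
  def initialize
    @lock = Mutex.new
    @entries = {}
  end

  def increment(key)
    # synchronize releases the lock when the block exits, even if an
    # exception is raised, much like a scope lock's destructor in C++.
    @lock.synchronize do
      @entries[key] = @entries.fetch(key, 0) + 1
    end
  end

  def [](key)
    @lock.synchronize { @entries[key] }
  end
end

table = SharedTable.new
threads = 4.times.map do
  Thread.new { 1000.times { table.increment(:hits) } }
end
threads.each(&:join)
puts table[:hits] # prints 4000
```

Locking every method like this is the "overlocking" trade-off from the interview: it is easy to get right, at the cost of some contention.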
InfoQ: Will the GIL be gone completely or might there be areas where it might remain, around extensions or other systems?
It's possible there will be around using extensions, simply because extensions were not designed to be run concurrently. We use a handle system for C extensions to access GC objects, which we can make thread safe. But since C extensions use C libraries and such which might not be thread-safe, we'll likely need to take more precautions.
We'll likely have one lock for all execution of extension methods, or one lock per extension that all extension methods in that extension share. This would achieve isolation of code which we'd expect to access shared data and have a bit better performance characteristics.
The lock-per-extension idea also allows an extension to tell the system it is thread safe already and to omit the lock. This would obviously be an extension to the C-API, but it's something that extension writers have already contacted me about.
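The lock-per-extension scheme can be sketched in plain Ruby. This is an illustration of the idea only, not the Rubinius C-API, which had not been defined at the time of the interview; ExtensionLocks, declare_thread_safe and call are hypothetical names:

```ruby
# Hypothetical registry: one lock per extension, skipped for extensions
# that have declared themselves thread safe.
class ExtensionLocks
  def initialize
    @registry_lock = Mutex.new
    @locks = {}       # extension name => Mutex
    @thread_safe = {} # extension name => true
  end

  # An extension opts out of locking by declaring itself thread safe.
  def declare_thread_safe(name)
    @registry_lock.synchronize { @thread_safe[name] = true }
  end

  # Run the given block as if it were a method of the named extension.
  def call(name)
    lock = @registry_lock.synchronize do
      @thread_safe[name] ? nil : (@locks[name] ||= Mutex.new)
    end
    return yield unless lock

    lock.synchronize { yield }
  end
end

locks = ExtensionLocks.new
locks.declare_thread_safe("fast_ext")

total = 0
workers = 4.times.map do
  Thread.new { 500.times { locks.call("legacy_ext") { total += 1 } } }
end
workers.each(&:join)
puts total # prints 2000
```

Replacing the per-extension mutexes with a single shared one would give the other option mentioned above, one lock for all extension methods, at the cost of serializing unrelated extensions.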
Hydra will be merged into a future version of Rubinius, which would leave MRI 1.9.2 as the only Ruby VM with a GIL; JRuby and IronRuby have none, and MacRuby dropped the GIL a while ago.
Everyone interested in seeing what the GIL does and how it's implemented in Ruby 1.9.x should take a look at Elise Huard's article on the GIL (GVL).
Ruby's (and Python's) lack of parallel running threads has led to developers looking at event-based and non-blocking I/O solutions as well as distributing the workload across OS processes. A lot of the recent excitement around Node.js' non-blocking nature seems related as well.
However, what happens once Ruby VMs without a GIL and parallel Ruby threads become mainstream - will this trend be reversed?
Python has threads
I don't know about Ruby, but Python has parallel running threads.
Re: Python has threads
If the link you provided addresses that, then I'm sorry I misunderstood you.