by Werner Schuster on May 23, 2007. Estimated reading time: 8 minutes


A recent interview with Matz (Yukihiro Matsumoto), creator of Ruby, and Sasada Koichi, creator of YARV, tackles the topic of Ruby's handling of threads. Current stable releases of Ruby use user space threads (also called "green threads"), which means that the Ruby interpreter takes care of everything to do with threads. This is in contrast to kernel threads, where creation, scheduling, and synchronization are done with OS syscalls, which makes these operations costly, at least compared to their user space equivalents. User space threads, on the other hand, cannot make use of multiple cores or multiple CPUs, because the OS doesn't know about them and thus can't schedule them on those cores/CPUs.

Ruby 1.9 has recently integrated YARV as the new Ruby VM, which, among other changes, has brought kernel threads to Ruby. The introduction of kernel threads (or "native threads") was widely welcomed, particularly by developers coming from Java or .NET, where kernel threads are the norm. However, there's a snag. Sasada Koichi explains:
As you know, YARV support native thread. It means that you can run each  Ruby thread on each native thread concurrently.

It doesn't mean that every Ruby thread runs in parallel. YARV has  global VM lock (global interpreter lock) which only one running Ruby thread has. This decision maybe makes us happy because we can run most of the extensions written in C without any modifications.
This means: no matter how many cores or CPUs are available, only one Ruby thread will be able to run at any given time. There are workarounds and native extensions can handle the Global Interpreter Lock (GIL) in more flexible ways, for instance, release it before starting a long running operation. Sasada Koichi explains the API available for releasing the GIL:
You must release Giant VM Lock before doing blocking task.  If you need do this in extension libraries, use rb_thread_blocking_region() API.

rb_thread_blocking_region(
    blocking_func, /* function that will block */
    data,          /* this will be passed to the above function */
    unblock_func   /* this function is called to unblock, or NULL */
)
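The effect of releasing the lock around blocking operations is visible from plain Ruby code: a thread waiting on I/O or a timer does not stall the others. A minimal sketch (the thread and variable names are arbitrary, and sleep stands in for a blocking call):

```ruby
# Two threads under a GIL: only one runs Ruby code at a time, but a
# blocking call releases the lock, so the other thread proceeds in
# the meantime.
results = []
blocking = Thread.new { sleep 0.2; results << :blocking_done }
cpu      = Thread.new { results << :cpu_done }
[cpu, blocking].each(&:join)
p results  # the non-blocking thread finishes first
```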

The problem: this effectively removes the biggest argument for kernel threads, the use of multiple cores or CPUs, while retaining their problems.

Kernel threads are also the reason why Continuations might be removed in future Ruby versions. Continuations are a means of cooperative scheduling, which means that one thread of execution explicitly hands off control to another one. The feature is also known under the name "Coroutine", and has been around for a long time. Recently, it moved into the public eye because of the Smalltalk-based web framework Seaside, which uses Continuations to significantly simplify web apps.
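For readers unfamiliar with the feature, a continuation captures "the rest of the computation" at a point, so that execution can jump back to it later. A minimal sketch (variable names are arbitrary; `callcc` lives in the `continuation` stdlib as of Ruby 1.9):

```ruby
require 'continuation'  # needed from Ruby 1.9 on; built in before that

steps = []
cont = callcc { |c| c }            # first pass: callcc returns the continuation
steps << :visited
cont.call(cont) if steps.size < 3  # jump back to just after callcc
p steps  # :visited is recorded three times without any explicit loop
```

Calling `cont` rewinds execution to the point where `callcc` returned, which is the control-transfer primitive frameworks like Seaside build on.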

The approach of using kernel threads with a GIL is comparable to Python's thread system, which also uses a GIL, and has done so for a long time. Python's GIL has caused countless debates about how to remove it, but it has stuck around all this time.

However, a look at the thoughts of Guido van Rossum, Python's creator, on threads gives a view of an alternative future for Ruby threading. In a recent post about the GIL, he explained:
Nevertheless, you're right the GIL is not as bad as you would initially think: you just have to undo the brainwashing you got from Windows and Java proponents who seem to consider threads as the only way to approach concurrent activities.

Just because Java was once aimed at a set-top box OS that didn't support multiple address spaces, and just because process creation in Windows used to be slow as a dog, doesn't mean that multiple processes (with judicious use of IPC) aren't a much better approach to writing apps for multi-CPU boxes than threads.

Just Say No to the combined evils of locking, deadlocks, lock granularity, livelocks, nondeterminism and race conditions.
The benefits of preemptively scheduled threads that share an address space have long been debated. Unix was, for the longest time, single threaded or user space threaded. Parallelism was implemented with multiple processes that communicated via various means of Inter-Process Communication (IPC), such as pipes, FIFOs, or explicitly shared memory regions. This was supported by the fork syscall, which allows a running process to be duplicated cheaply.
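This classic model is directly available from Ruby as well. A minimal sketch, assuming a Unix platform where fork is supported (it isn't on Windows or JRuby):

```ruby
# fork duplicates the running interpreter cheaply; a pipe serves as
# the IPC channel between parent and child.
reader, writer = IO.pipe

pid = fork do
  reader.close                    # child only writes
  writer.puts "result from child"
  writer.close
end

writer.close                      # parent only reads
child_output = reader.read
Process.wait(pid)
print child_output
```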

Recently, languages such as Erlang have gained a lot of interest by also using a share-nothing approach, combining so-called "lightweight processes" with an easy IPC method. The "lightweight processes" are not OS processes, but actually live inside the same address space. They are called "processes" because they cannot look into each other's memory areas. The "lightweight" comes from the fact that they are handled by a user space scheduler. For a long time, this meant that Erlang had the same problems as other user space threaded systems: no support for multiple cores or CPUs, and blocking syscalls would block all threads. Recently, though, this was solved by adopting an m:n approach: the Erlang runtime now uses multiple kernel threads, each of which runs a user space scheduler. This means that Erlang now gets to take advantage of multiple cores and CPUs without changing its execution model.
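The mailbox style can be approximated in Ruby today with threads that share nothing by convention and communicate only through queues. A minimal sketch (the names are arbitrary, and unlike Erlang, nothing here enforces the isolation):

```ruby
mailbox = Queue.new   # the worker's only channel to the outside world
replies = Queue.new

worker = Thread.new do
  msg = mailbox.pop           # blocks until a message arrives
  replies << "got #{msg}"     # answer via a queue, not shared state
end

mailbox << :ping              # roughly Erlang's  pid ! a_message
reply = replies.pop
worker.join
p reply
```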

Luckily for the Ruby space, the Ruby team is aware of this and is considering this future for Ruby:
[...] if we have multiple VM instance on a process, these VMs can be run in parallel. I'll work on that theme in the near future (as my research topic).

[...] if there are many many problems on native threads, I'll implement green thread. As you know, it's has some benefit against native thread (lightweight thread creation, etc). It  will be lovely hack (FYI. my graduation thesis is to implement userlevel thread library on our specific SMT CPU).
This indicates that a user space (green) threads version of Ruby is not off the table, particularly in light of implementation problems of threading systems on different OSes, such as this one:
Programming on native thread has own difficulty. For example, on MacOSX, exec() doesn't work (cause exception) if other threads are running (one of portability problem). If we find critical problems on native thread, I will make green thread version on trunk (YARV).
Why is there a need for Sasada Koichi's Multiple VM (MVM) solution? Running multiple Ruby interpreters and having them communicate via IPC methods (e.g. sockets) is possible today as well. However, it comes with a host of problems:
• The Ruby process needs to exec a new Ruby interpreter, which means it needs to know how it was launched (which Ruby executable to use). This quickly becomes difficult to do in a portable way. For instance: if JRuby is used, the executable needs to be "jruby". Worse: the JVM or application server running it might not allow launching external programs.
• The new Ruby interpreter needs to be set up with the correct environment variables, load paths, include paths, and the main .rb file to execute.
• Communication can happen via DRb, but this needs to go over the network, which is the only portable means of IPC.
• Network communication means negotiating ports (which port should the "server" part of the two programs listen on?).
• Network communication also means potential problems with firewalls that complain about programs opening connections or ports.
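The first problem at least has a partial workaround on C Ruby: the interpreter records how it was built, so the path of the running executable can be reconstructed. A sketch, assuming MRI, where modern versions spell the constant RbConfig (the JRuby and app-server restrictions remain a problem):

```ruby
require 'rbconfig'

# Reconstruct the path of the currently running interpreter from its
# build configuration instead of hard-coding "ruby" or "jruby".
current_ruby = File.join(
  RbConfig::CONFIG['bindir'],
  RbConfig::CONFIG['ruby_install_name'] + RbConfig::CONFIG['EXEEXT'].to_s
)
p File.executable?(current_ruby)
```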
Of course, these issues make this much more complicated than the Thread equivalent of firing off a new thread of execution:
x = Thread.new { p "hello" }
Or this Erlang sample:
pid_x = spawn(Node, Mod, Func, Args)
This Erlang code spawns a new lightweight process, and indeed, this is all the code that's needed. All the setup code is taken care of, with none of the problems explained above.
The pid is a handle to the new process, and allows, for instance, for simple communication:
pid_x ! a_message
This sends a simple message to the process with the pid stored in pid_x. The message can consist of various types, for instance atoms, Erlang's version of Ruby's symbols.

IPC as simple as this is certainly possible in Ruby too. Erlectricity is a new library that permits communication between Erlang and Ruby, but it could just as well be used between Ruby VMs. Erlang IPC is particularly interesting, as it uses a pattern matching approach that facilitates message passing and makes it very concise.

The Ruby MVM is certainly the most promising idea for the future of Ruby threading. It avoids the problems of the GIL and of manually wrangling Ruby processes, and uses the share-nothing ideas that make Erlang and other systems appealing for concurrency.

JRuby is the only Ruby version that uses kernel threads, mostly because it runs on the JVM, which supports them. The cost of creating kernel threads is somewhat offset by the use of thread pools (threads are created once and kept around for reuse). Details of IronRuby's threading support aren't known yet, but since the CLR and JVM are quite similar, it's likely that kernel threads will be used there too.

One possibility to prototype and experiment with the idea of a Ruby MVM would be to launch multiple instances of JRuby in the same JVM process and have them communicate. This would effectively provide the same cheap IPC (data can be passed simply by passing a reference, as long as the data is read-only).

Ola Bini recently wrote about his new jrubysrv idea, which allows running multiple JRuby instances in one JVM to save memory.

It seems the details of future thread support in Ruby are still undecided and might turn out quite differently in the alternative implementations.


Processes

I like Guido's ideas, and it certainly would work for concurrency in the large.

However, if you look at the micro sort of parallelism like Fortress has, or Haskell can do (where even small operations are parallel), IPC is really not suitable for that (think of a multicore CPU, for instance).

Ruby is a nice high level language; I would like to see threads arrive in a high level way (if possible), not just do what other languages do for the sake of completeness.

Memory Model

If they are going to use multiple cores, the chance of visibility issues is going to increase as well. It took a long time before Java's memory model was fixed (and a lot of very smart people worked on it). How are these issues going to be tackled in Ruby?

technical depth

Now here's a level of technical coverage I wouldn't have expected from InfoQ. Good thing though, these things are important when considering the future of Ruby (including IronRuby and JRuby) for enterprise development. Good work!

Great article

I'll second what Stefan said.

Second that

Me too. Great article!

Rubinius

Notably missing from the listed Ruby implementations is Rubinius, which as I understand it will support a wide variety of threading models, including green threads and the Erlang lightweight-process model. MenTaLguY is, I believe, pushing this forward, and considering porting his work over to some of the other implementations, too.

Hanging around in #rubinius while MenTaLguY is around will probably yield more enlightenment.

comprehensive coverage but..

It seems to be one of *the* most comprehensive pieces of coverage of threading in Ruby and the latest trends/research.
But the problem I see is that there is no clear thought process or research heading somewhere concrete.
Although Ruby MVM is posed as the best option available, issues with that approach are mentioned too, and any successful work on it is hard to find.

It seems threading in Ruby will remain experimental in nature and may only improve with the advent of some other language that has a solid implementation of the threading experiments done on the Ruby side.

Re: Rubinius

Antonio: yes, you're right. Some Rubinius coverage would be good, especially now that Evan Phoenix is paid for his Rubinius work. The current state of Rubinius threading seems to be userspace threads, with some more ideas/concepts on the horizon.

Re: comprehensive coverage but..

Ismail: Ruby's threading is not experimental in Ruby 1.8.x and JRuby 1.0; one has user space threads, the other has kernel threads. The implementations are solid and their issues are known.
Ruby 1.9.x is a bit of a wildcard right now, but that's alright since it's not a release yet; actually, I think it's not even an 'alpha' or anything like that.

If you want a "solid implementation", then just use the existing Ruby 1.8.x or JRuby. If you want a "new" language with a mature multiprogramming story, take a peek at Erlang. Erlang has been used for some highly scalable and rock solid apps.
Also, as for using only "solid implementations": Java's memory model (which is crucial for threading) was broken for the first 10 years of its existence (it was fixed in 1.5), yet this didn't have a bad impact on the success of Java or its applications.

Interesting topic

I find this topic to be very interesting. Since Ruby 1.9, JRuby, and Rubinius have a lot of leeway to experiment with new features like M:N threading or Erlang-style multiprocess concurrency, I am confident good things are on the horizon. I think it is particularly important not to fall into the trap of thinking that native threads will be a concurrency panacea for Ruby (or any other language). Erlang definitely seems to have the right idea.

Why not use fork()

"The Ruby process needs to exec a new Ruby interpreter, ..."

It doesn't seem like that is necessary if you just use fork()?

sRp

Oz/Mozart is another language with super lightweight internal threading, very much like Erlang's lightweight threading. They've also gone the route of no SMP: use multiple processes. At the moment, I think the main disadvantage is that it is slightly harder to take advantage of multiple processors/cores. For instance, instead of running a spawn for every element in a list and collecting the results, you'd need to include logic to distribute the spawns to different processes, and possibly even worry about dealing with a single process crashing. Most of this could be solved by having a good set of libraries to make the default case easy. The advantage is that it should make it much easier in the future to take fuller advantage of things like NUMA/SUMA/SUMO without having to deal with memory locality issues and expensive barriers. Such libraries should also be beneficial for making closely networked machines (such as blades) function together.

sRp