InfoQ

Ruby Userspace Threads vs GUI toolkits Roundup

Posted by Werner Schuster on Apr 11, 2007 10:00 AM

Community: Ruby
Topics: Programming
Tags: Concurrency, SmallTalk, GUI, Rubinius, JRuby, Threading

Blogger Ciaran McCreesh seems concerned with the lack of kernel threads in Ruby and ponders consequences for GUI applications:

A simple example: Consider a Gtk2 GUI application with a button and a text box. When the user presses the button, the application goes away and does some calculations using a third party library. When it is done, it displays the results in the text box. To prevent confusion, the button should be greyed out whilst the calculations are in progress. The naive implementation has a problem: whilst the calculations are taking place, the GUI can't respond to events. If, for example, the user switches away from and back to the window, the window won't get redrawn and it will look like the application has hung. The usual solution is to use threading. There are various ways of doing this and handling the communication, some more elegant than others. In any case, a child thread ends up doing the calculation whilst the parent thread remains free to handle GUI events. Except that with Ruby, this isn't an option. Using operating system threads (e.g. pthreads or Gdk threads) makes the Ruby interpreter explode. And when using Ruby threads, the thread handling the GUI won't get called so long as the child thread is busy in third party code.

 

While the lack of kernel threads (threads scheduled and handled by the OS) can be a problem, it is not an unsolvable one. I/O is the big issue, particularly network access, since it can potentially block for a long time. If this happens on the GUI event handling thread, it will leave the GUI unresponsive for the duration. That problem is solvable by using non-blocking I/O with the select call. select allows specifying timeouts for I/O operations, such as waiting for input from a socket. If no data has arrived before the timeout, the call returns and the code is free to yield to other threads, for instance the thread handling GUI events. Later on, the I/O operation can be tried again. With short timeouts, this acts like the time slicing mechanism that thread schedulers use.
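The polling approach described above might look roughly like the following sketch. It is only an illustration: the helper name `read_with_slices` and its `slice` and `max_wait` parameters are made up for this example, and a pipe stands in for a network socket so that the snippet is self-contained.

```ruby
# Hypothetical helper (not part of any library): poll `io` in short slices
# so that other Ruby threads, e.g. one running a GUI event loop, can run.
def read_with_slices(io, slice: 0.05, max_wait: 2.0)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + max_wait
  loop do
    # IO.select returns nil if nothing became readable within `slice` seconds.
    ready = IO.select([io], nil, nil, slice)
    if ready
      begin
        return io.read_nonblock(4096)
      rescue IO::WaitReadable
        # Nothing to read after all; fall through and keep polling.
      rescue EOFError
        return nil
      end
    end
    return nil if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    Thread.pass # give other threads a turn before trying again
  end
end

# A pipe stands in for a slow network peer (assumption for this demo).
reader, writer = IO.pipe
producer = Thread.new do
  sleep 0.2
  writer.write("result")
  writer.close
end

data = read_with_slices(reader)
producer.join
puts data
```

The shorter the slice, the more often control returns to other threads, at the price of more wakeups.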

As a matter of fact, many Smalltalk implementations like Squeak or Cincom VisualWorks also use user-space threads, and their proponents see it as an advantage. Cincom's Product Manager James Robertson expresses his surprise at the plans for Ruby 2.0 to have kernel threads (Note: he refers to user-space threads as green threads, to kernel threads as native threads):

In VisualWorks, threads are green. This makes them really, really inexpensive. For instance, in BottomFeeder, I spawn a thread per feed (I'm subscribing to 309 as I write this). I don't need to worry about thread pools, or overwhelming the platform. If I tried such a feat with native threads, I'd have to worry about those things.

There are a few remaining problems, as pointed out in the comments. One example is a system call like a DNS lookup, which cannot be worked around with non-blocking I/O methods. James Robertson, in the comments on the above article:

At present, DNS lookups do, in fact, block at the VM level. We are implementing an asynch client for in order to deal with that OS level limitation.

As an aside, Rubinius has added initial support for user-space threads. Rubinius is a Ruby runtime implementation that draws heavily on the design behind Squeak and the "Blue Book", a description of the Smalltalk-80 system.

 

Another option is to let the OS scheduler handle this by starting multiple Ruby processes, which has the benefit of exploiting multiple CPUs or cores if they are available (the OS can schedule the runtimes onto available CPUs or cores). The runtimes can interact via appropriate methods of Inter-Process Communication (IPC). Ruby ships with libraries such as DRb that make communication easy. Using sockets makes it possible to use non-blocking I/O for communication with other runtimes. By offloading long-running tasks to other runtimes, the GUI can be kept responsive. This also helps enforce decoupling of the GUI from the backend of an application, by making every communication between the two explicit and shortcuts impossible. Avoiding shared-state, pre-emptive multithreading and its troubles - such as synchronization or race conditions - in favor of isolated processes that exchange messages is an idea implemented by languages such as Erlang.
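A minimal sketch of the multi-process idea, assuming a Unix-like system where `fork` is available. It uses a plain pipe and `Marshal` rather than DRb to stay small; `slow_calculation` and the `ticks` counter are hypothetical stand-ins for real backend work and for GUI event handling.

```ruby
# Hypothetical stand-in for a long-running backend calculation.
def slow_calculation(n)
  (1..n).reduce(:+)
end

reader, writer = IO.pipe

# Child process: a separate Ruby runtime that the OS schedules independently.
pid = fork do
  reader.close
  writer.write(Marshal.dump(slow_calculation(100_000)))
  writer.close
end
writer.close # the parent only reads

ticks = 0
payload = String.new # binary buffer for the marshalled result
loop do
  # Wait at most 10 ms for output, then go back to "GUI work".
  if IO.select([reader], nil, nil, 0.01)
    chunk = reader.read_nonblock(4096, exception: false)
    break if chunk.nil? # EOF: the child closed its end of the pipe
    payload << chunk unless chunk == :wait_readable
  else
    ticks += 1 # placeholder for processing pending GUI events
  end
end

Process.wait(pid)
result = Marshal.load(payload)
puts result
```

Because the two runtimes share nothing but the pipe, all communication between "GUI" and "backend" is explicit, which is the decoupling effect mentioned above.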

An alternative solution presents itself in JRuby, a Ruby implementation for the Java platform. JVMs have been using kernel threads for many years (the first versions of Java used green threads too). JRuby's solution to handling threads is to map each Ruby thread onto a Java thread, thus reaping the advantages of kernel threads. Since many toolkits have Java bindings, using them from JRuby is an option. JavaGnome allows access to Gnome and GTK+, Qt Jambi(tm) gives access to Trolltech's popular Qt GUI toolkit. Of course, other choices such as SWT, the GUI toolkit behind Eclipse and BitTorrent client Azureus, are possible as well.

6 comments

  1. YARV?

    Apr 11, 2007 12:00 PM by Tom Nichols

    Does YARV plan to provide native threads?

  2. Re: YARV?

    Apr 11, 2007 12:29 PM by Werner Schuster

    Yep. See these slides from a talk on YARV at last year's RubyConf: http://www.atdot.net/yarv/rc2006_sasada_yarv_on_rails.pdf (a bunch of slides further in the back discuss which models for this are possible and which is likely to be chosen).

  3. Re: YARV?

    Apr 11, 2007 5:44 PM by Charles Nutter

    Yes, but see slide 38...YARV supports native threads, but they can't run in parallel. So the same problems would exist. Thread A holds the giant lock for the whole VM and goes off to call a third-party library. No other native thread can get the lock during that time, so we're back to the same state.

  4. FYI, JRuby also supports pooling native threads

    Apr 11, 2007 5:47 PM by Charles Nutter

    Spinning up a native thread for every Ruby thread is sometimes not too scalable, so JRuby also supports a pooling mechanism where threads are reused. This results in new thread launching being extremely cheap, but still using a native thread. When that thread completes the native thread returns to the pool to be reused. Since many of the use cases for green threads are to allow lightweight spinning up of many short-lived threads, this basically covers the main reasons folks believe green threads scale better. And with JRuby you can choose 1:1 threading or pooled threading without giving up kernel-level threads.
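As a rough illustration of the pooling idea in plain Ruby (this is not JRuby's actual mechanism or API; the class name `TinyPool` and its methods are invented for this sketch), a fixed set of worker threads can pull jobs from a shared queue so that submitting work never creates a new thread:

```ruby
# Hypothetical minimal pool: a fixed set of threads is reused for many jobs.
class TinyPool
  def initialize(size)
    @jobs = Queue.new
    @workers = Array.new(size) do
      Thread.new do
        # Queue#pop blocks until a job arrives; nil is the stop signal.
        while (job = @jobs.pop)
          job.call
        end
      end
    end
  end

  # Enqueue a block; an idle worker picks it up.
  def submit(&block)
    @jobs << block
  end

  # Send one stop signal per worker, then wait for all of them to finish.
  def shutdown
    @workers.size.times { @jobs << nil }
    @workers.each(&:join)
  end
end

results = Queue.new
pool = TinyPool.new(4)
20.times { |i| pool.submit { results << i * i } }
pool.shutdown

squares = Array.new(results.size) { results.pop }.sort
puts squares.length
```

Twenty short jobs run on only four threads here; the cost of starting a job is a queue push rather than a thread launch.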

  5. Re: YARV?

    Apr 13, 2007 6:57 AM by Werner Schuster

    @Charles: you're right, it seems that YARV will use a GVML (Global VM Lock) approach for now. But look at this: http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/10252 This means that long running or blocking calls _can_ run in parallel by explicitly releasing the GVML before the operation. So, as long as the library responsible for the long running/blocking operation is a well behaved YARV citizen, this should be a solution. In the linked post, SASADA Koichi says:

    "(2) You can't call Ruby's method from solve_difficult_numerical_problem(). I'll prepare API to support this."

    I'm not sure what exactly he means by "Ruby's method" - either a Ruby method or a method of the Ruby VM - but it seems there'll be other solutions. Making a VM thread safe is a bey-ach job to do... errr, I mean it's hard - listen to this podcast: http://www.cincomsmalltalk.com/blog/blogView?showComments=true&entry=3346274610 where Dave Griswold talks about how the Strongtalk VM was turned into what is now Hotspot, and how long it took to actually make the damn thing thread safe.

  6. Re: YARV?

    Apr 16, 2007 2:25 PM by Charles Nutter

    @Charles: you're right, it seems that YARV will use a GVML (Global VM Lock) approach for now. But look at this: blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby... This means that long running or blocking calls _can_ run in parallel by explicitly releasing the GVML before the operation. So, as long as the library responsible for the long running/blocking operation is a well behaved YARV citizen, this should be a solution.

    Yes, I figured that would be true...but the problem is that you have to release that lock, which sucks. In JRuby, there's no such locking. The only time two threads might contend for locks is on data structures (synchronized just enough to maintain internal integrity) and when one thread wants to change the state of another thread (like killing, raising, etc). Calls out to Java code won't block other threads from executing, and there's no need to actively acquire or release locks at the borders. I understand the difficulty of making MRI threadsafe though, and I sympathize. It's a brutal task even in a world where segfaults are impossible (i.e. synchronizing Java code after-the-fact) but doing it in C when you've got a dozen pieces of code all mutating the same piece of memory in different ways...is absolutely terrifying.
