InfoQ

News

Ruby 1.9 adds Fibers for lightweight concurrency

Posted by Werner Schuster on Aug 24, 2007 09:00 AM

Community
Ruby
Topics
Performance & Scalability,
Dynamic Languages,
Programming
Tags
Multi-threading,
Parallel Programming,
Language Features,
Continuations,
Threading
Threading in Ruby has been a topic of discussion for a long time. Whether future Ruby versions (1.9 and beyond) will use kernel threads instead of userspace threads, is still to be decided. Recently, another path for this set of problems has arrived in Ruby. David Flanagan points out a a new feature in the Ruby 1.9 branch called Fibers:
Here's how to use the new Fiber class (warning: class name may change) to generate an infinite sequence of Fibonacci numbers. I use "generate" in the sense of Python's generators. Ruby's new fibers are "semi-coroutines".

1  fib = Fiber.new do  
2 x, y = 0, 1
3 loop do
4 Fiber.yield y
5 x,y = y,x+y
6 end
7 end
8 20.times { puts fib.resume }
The code prints out the first 20 Fibonacci numbers. The concept it uses is called Coroutine. Basically, invoking Fiber.yield call stops ("suspends") execution of this code (don't confuse it with yield used for executing blocks). If you're familiar with debuggers, imagine hitting "suspend" on a thread or seeing the thread hit a breakpoint. fib is a handle to this Fiber, and can be used to manipulate it. The fib.resume call in line 8, does exactly what it says: it resumes the execution of the Fiber's code with the statement after the Fiber.yield call.

Line 4 shows Fiber.yield called with a parameter y. In a way, this can be thought of as similar to return y. The difference between a subroutine's return and a coroutine's Fiber.yield is what happens to the context of the code after the call. return means that, say, a function call's activation frame (or stack frame) is deallocated, which means all local variables are gone. With a Coroutine, yielding keeps the activation frame and all the data therein alive, so the code can use it when it'sresumed later.

Now it becomes clear how the code works: it's just a loop that iteratively calculates one Fibonacci number after the other. Once it's done with one, it suspends itself, giving someone else the control of the CPU. When that code wants the next number in the sequence, it simply resumes the Fibonacci code, which runs another iteration and then hands off control (and the next Fibonacci number) by suspending itself with Fiber.yield

This is a rather new addition to the Ruby 1.9 branch, and it seems that specifics aren't yet decided. The term Fiber might be familiar to Windows programmers. The MSDN entry explains them as such:
A fiber is a unit of execution that must be manually scheduled by the application. Fibers run in the context of the threads that schedule them. Each thread can schedule multiple fibers. In general, fibers do not provide advantages over a well-designed multithreaded application.However, using fibers can make it easier to port applications that were designed to schedule their own threads.
Sasada Koichi, developer of the Ruby 1.9 VM, formerly known as YARV, gives some more information on the ruby-core mailing list
These method names (resume/yield) are from Lua. "transfer" is from Modula-2. "double resume error" is from Python's generator. BTW, I'm thinking about name "Fiber". Current Fiber means Semi-Coroutine. Fiber::Core is Coroutine. Yes, name of Fiber is from Microsoft, but it's means Semi-Coroutine such as Lua's coroutine and Python's generator.
Semi-Coroutines are asymmetric Coroutines which are limited in their choice of transfer of control. Asymmetric Coroutines can only transfer control back to their caller, where Coroutines are free to transfer control to any other Coroutine, as long as they have a handle to it.
The example above shows how a Semi-Coroutine is used as a Generator, i.e. to conveniently generate Fibonacci numbers. Some languages, such as Python, support Generators in the language and have special syntax for it. From the quote, it seems that both Semi-Coroutine (Fiber) and Coroutine (Fiber::Core) behavior is supported. What will eventually show up in Ruby 1.9 and beyond and how it will be named remains to be seen, but Yukihiro Matsumoto, creator of the Ruby language, considers them safe:
It is still hot topic among core developers. But fibers (and external iterators) are likely to remain in the final 1.9, more likely than continuations.
Note: Continuations, a feature long absent in the Ruby 1.9 branch, were added to Ruby 1.9 in May despite concerns about whether they were feasible with Ruby 1.9's kernel threads.

Next to implementing control structures, Coroutines provide a way to use lightweight concurrency. In effect, they allow to implement userspace threads with cooperative scheduling. The Coroutines can either yield control to each other, or have centralized scheduling by handing off control to one scheduler Coroutine which then decides who gets scheduled next.

This can address concerns about Ruby 1.9's move to the more heavyweight kernel threads. Ruby 1.8 threads are built as a userspace threading system, which has the benefit of less thread management overhead. Creating a kernel thread involves a syscall to the OS, which takes more time than a single in-process call to a threading system. JRuby, for instance, uses kernel threads, but tries to offset the creation cost by using a thread pool.

Nevertheless, creating a lot of kernel threads still has a lot of overhead, or might simply cause problems on OSes that have hard thread limits or struggle with large numbers of threads. It's in these cases, when a lightweight alternative is useful. It allows the code to be split among  threads, if that is the logic, straightforward solution, but keeps the overhead down. Another advantage of the solution is that kernel threads are still available if a long running operation or syscall needs to be invoked but must not block the execution of all code in the process.

A similar approach is used in Erlang which also provides lightweight processes, with the difference that Erlang processes share nothing, whereas Fibers share an address space. However, the availability of Fibers allows Actor-style programming, without having to worry about overhead.

Fibers are also not absolutely new in the Ruby space. Rubinius has Tasks, which are described as similar to Ruby 1.9 Fibers. (InfoQ recently featured with Rubinius project lead Evan Phoenix on this threading in Rubinius). MenTaLguY details this on ruby-core:
In modern concurrency settings they [Fibers] are becoming increasingly useful, however. Without them, or something like them (e.g. Rubinius Tasks), you must play some very ugly games to get lightweight concurrency -- see the use of explicit continuation-passing (functions, not Continuations) in Scala's actors library for an example of the best that can be hoped for in their absence.
Granted, Fibers will make things harder for JRuby
This last comment, brings up an important point. If Fibers get adopted in Ruby, this will create headaches for Ruby implementations targeting the JVM or CLR, such as JRuby, XRuby, Ruby.NET or IronRuby. None of them currently support Continuations because manipulating or reading the callstack is hard or impossible to do in these VMs. The lack of Continuations is a controversial issue, but doesn't seem to have caused problems with e.g. JRuby, because they are not widely used in Ruby. The only use of them in the Ruby 1.8 standard library is the Generator implementation, but, for instance, JRuby 1.0, solved this by implementing the same functionality without using Continuations.

While it's certainly possible to implement these features using workarounds, the question is whether these workaround will cause performance regressions. If, for instance, call stacks must be emulated on the heap, instead of using the VM's stack, this can lead to lower performance or prevent (JIT) compiler optimizations from being applied. Workarounds for asymmetric Coroutines would be easier to do, as they could make use of the VM's stack for method invocations. Languages such as C# implement their Iterator feature, which allows to write Generators similar to the sample code above, this way.

4 comments

Reply

Generators in python? by Michael Neale Posted Aug 26, 2007 6:14 PM
Re: Generators in python? by Werner Schuster Posted Aug 26, 2007 7:09 PM
I may be missing something here by Frank Carver Posted Sep 4, 2007 5:04 AM
Re: I may be missing something here by Jaux Liu Posted Jan 3, 2008 10:25 PM
  1. Back to top

    Generators in python?

    Aug 26, 2007 6:14 PM by Michael Neale

    This looks like a slightly overcomplicated version yield/generators in python. Not really sure what the difference is on first look.

  2. Back to top

    Re: Generators in python?

    Aug 26, 2007 7:09 PM by Werner Schuster

    Generators in Python only yield to their caller, whereas Coroutines can yield to any other Coroutine - these are the Asymmetric Coroutines mentioned in the article. So: Coroutines are the more general concept... from what I can gather from the ruby-core discussions, it seems like Fibers provide the functionality for all this. Everything else - Generators, External Iterators etc, is built on top of them. Again: this topics seems to be in flux, so things might change. It's possible to emulate Coroutines with Generators to a certain degree: http://www.oreillynet.com/onlamp/blog/2004/07/my_own_take_on_interactive_gen.html http://www.ibm.com/developerworks/linux/library/l-pygen.html These articles describe a method where Coroutines are emulated by having a central scheduler which creates and calls the 'Coroutines', which are in fact Generators. One 'Coroutine' can yield to another 'Coroutine' by yielding to the scheduler with a handle to the 'Coroutine'; the scheduler then searches his list of 'Coroutines' and calls it's Generator next method on it. There's all kinds of work being done on this in the Python space, with things like Stackless Python http://en.wikipedia.org/wiki/Stackless_Python

  3. Back to top

    I may be missing something here

    Sep 4, 2007 5:04 AM by Frank Carver

    Can someone explain why (except for a bit of syntax sugar) this is different from just creating and using an object with persistent member values. For example, something like:

    class Fibber
      $x, $y = 0, 1 
      def fib
        $x,$y = $y,$x+$y 
        $x
      end
    end
    fib = Fibber.new
    20.times { puts fib.fib }
    
    This feels much more like Object-Oriented "business as usual" than any form of concurrency.

  4. Back to top

    Re: I may be missing something here

    Jan 3, 2008 10:25 PM by Jaux Liu

    The following is I quoted from my course slide, it might help. "A coroutine properly handles the class of problems that require state information to be re- tained between successive calls (e.g. finite state machine)." Did you miss the first zero in the sequence, so did the code in the article? Sorry if I'm wrong, I don't know Ruby.

Exclusive Content

Tapestry for Nonbelievers

A new article by I. Drobiazko and R. Zubairov introduces v. 5 of the Apache Tapestry component-oriented web framework. The tutorial shows how to create a component and covers IoC in Tapestry and Ajax.

Pete Lacey on REST and Web Services

In this interview, Burton Group consultant Pete Lacey talks to Stefan Tilkov about his disillusionment with SOAP, his opinion on REST, and addresses some of the perceived shortcomings REST vs. WS-*.

Business Natural Languages Development in Ruby

Jay Fields presents his concept of Business Natural Languages - a type of Domain Specific Languages geared towards being readable by domain experts.

Distributed Version Control Systems: A Not-So-Quick Guide Through

Adoption and interest for Distributed Version Control Systems is constantly rising. We will introduce the concept of DVCS and have a look at 3 actors in the area: git, Mercurial and Bazaar.

Segundo Velasquez and Agile as Seen Through the Customer's Eyes

Deborah Hartmann interviewed Segundo Velasquez about his experience as customer with an Agile team during the initial phase of software design of a product.

Fine Grained Versioning with ClickOnce

David Cooksey shows how to fine grained versioning to a ClickOnce deployment using an HttpHandler written with ASP.NET, making partial rollouts to a test audience much easier.

Implementing Manual Activities in Windows Workflow

Windows workflow (WF) is an excellent framework for implementing business processes, but lacks support for human activities. This article describes a completely generic approach for changing this.

Markus Voelter about Software Architecture Documentation

In this interview taken during OOPSLA 2007, Markus Voelter talks about the importance of documenting the software architecture, and gives some good and also bad examples on how it could be done.