BT

Facilitating the Spread of Knowledge and Innovation in Professional Software Development

Write for InfoQ

Topics

Choose your language

InfoQ Homepage News Ruby 1.9 adds Fibers for lightweight concurrency

Ruby 1.9 adds Fibers for lightweight concurrency

Bookmarks
Threading in Ruby has been a topic of discussion for a long time. Whether future Ruby versions (1.9 and beyond) will use kernel threads instead of userspace threads, is still to be decided. Recently, another path for this set of problems has arrived in Ruby. David Flanagan points out a a new feature in the Ruby 1.9 branch called Fibers:
Here's how to use the new Fiber class (warning: class name may change) to generate an infinite sequence of Fibonacci numbers. I use "generate" in the sense of Python's generators. Ruby's new fibers are "semi-coroutines".

1  fib = Fiber.new do  
2 x, y = 0, 1
3 loop do
4 Fiber.yield y
5 x,y = y,x+y
6 end
7 end
8 20.times { puts fib.resume }
The code prints out the first 20 Fibonacci numbers. The concept it uses is called Coroutine. Basically, invoking Fiber.yield call stops ("suspends") execution of this code (don't confuse it with yield used for executing blocks). If you're familiar with debuggers, imagine hitting "suspend" on a thread or seeing the thread hit a breakpoint. fib is a handle to this Fiber, and can be used to manipulate it. The fib.resume call in line 8, does exactly what it says: it resumes the execution of the Fiber's code with the statement after the Fiber.yield call.

Line 4 shows Fiber.yield called with a parameter y. In a way, this can be thought of as similar to return y. The difference between a subroutine's return and a coroutine's Fiber.yield is what happens to the context of the code after the call. return means that, say, a function call's activation frame (or stack frame) is deallocated, which means all local variables are gone. With a Coroutine, yielding keeps the activation frame and all the data therein alive, so the code can use it when it'sresumed later.

Now it becomes clear how the code works: it's just a loop that iteratively calculates one Fibonacci number after the other. Once it's done with one, it suspends itself, giving someone else the control of the CPU. When that code wants the next number in the sequence, it simply resumes the Fibonacci code, which runs another iteration and then hands off control (and the next Fibonacci number) by suspending itself with Fiber.yield

This is a rather new addition to the Ruby 1.9 branch, and it seems that specifics aren't yet decided. The term Fiber might be familiar to Windows programmers. The MSDN entry explains them as such:
A fiber is a unit of execution that must be manually scheduled by the application. Fibers run in the context of the threads that schedule them. Each thread can schedule multiple fibers. In general, fibers do not provide advantages over a well-designed multithreaded application.However, using fibers can make it easier to port applications that were designed to schedule their own threads.
Sasada Koichi, developer of the Ruby 1.9 VM, formerly known as YARV, gives some more information on the ruby-core mailing list
These method names (resume/yield) are from Lua. "transfer" is from Modula-2. "double resume error" is from Python's generator. BTW, I'm thinking about name "Fiber". Current Fiber means Semi-Coroutine. Fiber::Core is Coroutine. Yes, name of Fiber is from Microsoft, but it's means Semi-Coroutine such as Lua's coroutine and Python's generator.
Semi-Coroutines are asymmetric Coroutines which are limited in their choice of transfer of control. Asymmetric Coroutines can only transfer control back to their caller, where Coroutines are free to transfer control to any other Coroutine, as long as they have a handle to it.
The example above shows how a Semi-Coroutine is used as a Generator, i.e. to conveniently generate Fibonacci numbers. Some languages, such as Python, support Generators in the language and have special syntax for it. From the quote, it seems that both Semi-Coroutine (Fiber) and Coroutine (Fiber::Core) behavior is supported. What will eventually show up in Ruby 1.9 and beyond and how it will be named remains to be seen, but Yukihiro Matsumoto, creator of the Ruby language, considers them safe:
It is still hot topic among core developers. But fibers (and external iterators) are likely to remain in the final 1.9, more likely than continuations.
Note: Continuations, a feature long absent in the Ruby 1.9 branch, were added to Ruby 1.9 in May despite concerns about whether they were feasible with Ruby 1.9's kernel threads.

Next to implementing control structures, Coroutines provide a way to use lightweight concurrency. In effect, they allow to implement userspace threads with cooperative scheduling. The Coroutines can either yield control to each other, or have centralized scheduling by handing off control to one scheduler Coroutine which then decides who gets scheduled next.

This can address concerns about Ruby 1.9's move to the more heavyweight kernel threads. Ruby 1.8 threads are built as a userspace threading system, which has the benefit of less thread management overhead. Creating a kernel thread involves a syscall to the OS, which takes more time than a single in-process call to a threading system. JRuby, for instance, uses kernel threads, but tries to offset the creation cost by using a thread pool.

Nevertheless, creating a lot of kernel threads still has a lot of overhead, or might simply cause problems on OSes that have hard thread limits or struggle with large numbers of threads. It's in these cases, when a lightweight alternative is useful. It allows the code to be split among  threads, if that is the logic, straightforward solution, but keeps the overhead down. Another advantage of the solution is that kernel threads are still available if a long running operation or syscall needs to be invoked but must not block the execution of all code in the process.

A similar approach is used in Erlang which also provides lightweight processes, with the difference that Erlang processes share nothing, whereas Fibers share an address space. However, the availability of Fibers allows Actor-style programming, without having to worry about overhead.

Fibers are also not absolutely new in the Ruby space. Rubinius has Tasks, which are described as similar to Ruby 1.9 Fibers. (InfoQ recently featured with Rubinius project lead Evan Phoenix on this threading in Rubinius). MenTaLguY details this on ruby-core:
In modern concurrency settings they [Fibers] are becoming increasingly useful, however. Without them, or something like them (e.g. Rubinius Tasks), you must play some very ugly games to get lightweight concurrency -- see the use of explicit continuation-passing (functions, not Continuations) in Scala's actors library for an example of the best that can be hoped for in their absence.
Granted, Fibers will make things harder for JRuby
This last comment, brings up an important point. If Fibers get adopted in Ruby, this will create headaches for Ruby implementations targeting the JVM or CLR, such as JRuby, XRuby, Ruby.NET or IronRuby. None of them currently support Continuations because manipulating or reading the callstack is hard or impossible to do in these VMs. The lack of Continuations is a controversial issue, but doesn't seem to have caused problems with e.g. JRuby, because they are not widely used in Ruby. The only use of them in the Ruby 1.8 standard library is the Generator implementation, but, for instance, JRuby 1.0, solved this by implementing the same functionality without using Continuations.

While it's certainly possible to implement these features using workarounds, the question is whether these workaround will cause performance regressions. If, for instance, call stacks must be emulated on the heap, instead of using the VM's stack, this can lead to lower performance or prevent (JIT) compiler optimizations from being applied. Workarounds for asymmetric Coroutines would be easier to do, as they could make use of the VM's stack for method invocations. Languages such as C# implement their Iterator feature, which allows to write Generators similar to the sample code above, this way.

Rate this Article

Adoption
Style

Hello stranger!

You need to Register an InfoQ account or or login to post comments. But there's so much more behind being registered.

Get the most out of the InfoQ experience.

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Community comments

  • Generators in python?

    by Michael Neale,

    Your message is awaiting moderation. Thank you for participating in the discussion.

    This looks like a slightly overcomplicated version yield/generators in python. Not really sure what the difference is on first look.

  • Re: Generators in python?

    by Werner Schuster,

    Your message is awaiting moderation. Thank you for participating in the discussion.

    Generators in Python only yield to their caller, whereas Coroutines can yield to any other Coroutine - these are the Asymmetric Coroutines mentioned in the article. So: Coroutines are the more general concept... from what I can gather from the ruby-core discussions, it seems like Fibers provide the functionality for all this. Everything else - Generators, External Iterators etc, is built on top of them. Again: this topics seems to be in flux, so things might change.

    It's possible to emulate Coroutines with Generators to a certain degree:
    www.oreillynet.com/onlamp/blog/2004/07/my_own_t...
    www.ibm.com/developerworks/linux/library/l-pyge...
    These articles describe a method where Coroutines are emulated by having a central scheduler which creates and calls the 'Coroutines', which are in fact Generators. One 'Coroutine' can yield to another 'Coroutine' by yielding to the scheduler with a handle to the 'Coroutine'; the scheduler then searches his list of 'Coroutines' and calls it's Generator next method on it.

    There's all kinds of work being done on this in the Python space, with things like Stackless Python en.wikipedia.org/wiki/Stackless_Python

  • I may be missing something here

    by Frank Carver,

    Your message is awaiting moderation. Thank you for participating in the discussion.

    Can someone explain why (except for a bit of syntax sugar) this is different from just creating and using an object with persistent member values. For example, something like:


    class Fibber
    $x, $y = 0, 1
    def fib
    $x,$y = $y,$x+$y
    $x
    end
    end
    fib = Fibber.new
    20.times { puts fib.fib }


    This feels much more like Object-Oriented "business as usual" than any form of concurrency.

  • Re: I may be missing something here

    by Ye Liu,

    Your message is awaiting moderation. Thank you for participating in the discussion.

    The following is I quoted from my course slide, it might help.

    "A coroutine properly handles the class of problems that require state information to be re-
    tained between successive calls (e.g. finite state machine)."

    Did you miss the first zero in the sequence, so did the code in the article? Sorry if I'm wrong, I don't know Ruby.

  • Re: I may be missing something here

    by Roger Pack,

    Your message is awaiting moderation. Thank you for participating in the discussion.

    It's tough to tell from the example [which was explaining the syntax of using them, not how they are used].
    The basic difference is that you don't have to use those globals, as it saves context from call to call.
    Think of it as if it were threaded, above. The benefit of threads being that they keep track of local variables and context [local to themselves] very nicely, so you don't have to worry about storing it away in a global then reloading it when you go back.

    The other benefit of 'real' threads [that they automatically interleave] isn't present with these. But the variable storing stuff is.

  • Re: I may be missing something here

    by Sam Danielson,

    Your message is awaiting moderation. Thank you for participating in the discussion.

    The fib example isn't very good. Normally when a method yields it just starts a new invocation of a block that was passed to it. Execution continues after the yield when the block exits. It's just syntax sugar for calling a function, really. Ruby is a bit weird and allows blocks to force their caller to return, creating the illusion that the code passed to the iterator is in control. Actually it's not. Code passed to a '(1..10).each' iterator is just a function that must exit before the next iteration.

    Unlike objects with regular yield you should think of Fibers as having their own stack. Here's a contrived example that counts down from 10 but uses recursion to do it. Notice that the while loop is a consumer of the @nester Fiber. There is no consuming block that must exit on each iteration. In your example the fib method must exit on each iteration and in #each loops the consumer must exit. Neither the consumer nor the producer must exit during iteration with Fibers.


    def nest(a)
    if a > 0
    Fiber.yield a
    loop(a - 1)
    end
    end

    @nester = Fiber.new do
    nest(10)
    end

    while (a = @nester.resume) do
    puts a
    end


    Recursion is nice for some things (though counting isn't typically one of them) and that's the real advantage of a fiber. It allows you to put recursion in your generators and call them from your own possibly recursive code.

  • Re: I may be missing something here

    by Sam Danielson,

    Your message is awaiting moderation. Thank you for participating in the discussion.

    Oops. In nest(), change loop(a - 1) to nest(a - 1). I'm not very good with names.

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

BT