Inside the full speed Rubinius debugger

Posted by Werner Schuster on Jan 23, 2008 03:57 PM

Debugger support is available in Ruby - a command line debugger has shipped with it for a long time. Speed, however, has been a problem. Ruby debuggers have been implemented using Ruby's tracing feature, i.e. a callback block or function is called before each line of Ruby code is executed. This callback checks whether the thread is suspended or whether a breakpoint exists on that line.

The default Ruby debugger is implemented in Ruby and simply registers a Ruby block with set_trace_func. Faster versions of this approach were implemented in C (ruby-debug, Cylon debugger) and Java (jruby-debug). However, no matter how fast the callback executes, the problem with this approach remains: as soon as the debugger is started, every line of Ruby code incurs an overhead.
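
To make that cost concrete, here is a minimal sketch of a tracing-based breakpoint check built on set_trace_func. The breakpoints table and the sum_to method are made up for illustration and are not taken from any of the debuggers named above; the point is only that the callback runs for every executed line, even when no breakpoint is set anywhere.

```ruby
# Sketch of a tracing-based breakpoint check (illustrative only).
breakpoints = {}   # hypothetical table: line number => true; deliberately empty
line_events = 0    # how often the callback fired for a "line" event

tracer = proc do |event, _file, line, _id, _binding, _klass|
  if event == "line"
    line_events += 1
    # A real debugger would suspend the thread here when breakpoints[line] is set.
    breakpoints[line]
  end
end

def sum_to(n)
  total = 0
  1.upto(n) do |i|
    total += i
  end
  total
end

set_trace_func(tracer)   # from here on, every executed line calls the tracer
result = sum_to(100)
set_trace_func(nil)      # switch tracing off again

puts "result: #{result}, line callbacks: #{line_events}"
```

Even with an empty breakpoint table, the callback count grows with the number of lines executed, which is exactly the overhead described above.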

An ideal solution for this has no overhead and simply suspends the thread when a breakpoint is hit, i.e. a breakpoint has no cost (in CPU cycles) until it's hit. This is the approach that Rubinius' full speed debugger uses - with "full speed" meaning that a program runs at its normal speed even if it's being debugged.

Rubinius' full speed debugger is made possible by these characteristics:
  • Rubinius compiles all Ruby code into instructions (op_codes) which the shotgun VM runs - currently with an op_code interpreter
  • The full speed debugger functionality introduced a new instruction yield_debugger which - when executed - notifies the debugger thread on a defined debugging channel (Channels are a kind of pipe - i.e. data sent into it can be received on the other end).
  • It's possible to access the bytecode for a Method - actually it's trivial to do so. Here is an example with String's to_s method:
    m = "".method(:to_s)
    cm = m.compiled_method
    # this yields an array of InstructionSet::Opcode objects
    cm.bytecodes.decode
  • Various utility methods help map instruction offsets to line numbers, such as CompiledMethod's first_ip_on_line etc.
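
The channel hand-off described in the list above can be imitated in plain Ruby. This is only a sketch: it uses Ruby's thread-safe Queue as a stand-in for a Rubinius Channel, and the names debug_channel and resume_channel are invented for the example.

```ruby
# Toy stand-in for a debugging channel, built on Ruby's Queue.
# This is not the Rubinius Channel API; all names are illustrative.
debug_channel  = Queue.new   # debugged thread -> debugger thread
resume_channel = Queue.new   # debugger thread -> debugged thread
log = []

debugger = Thread.new do
  event = debug_channel.pop            # blocks until a breakpoint reports in
  log << "debugger notified: #{event}"
  resume_channel.push(:continue)       # let the debugged thread carry on
end

worker = Thread.new do
  log << "worker running"
  debug_channel.push(:breakpoint_hit)  # roughly what executing yield_debugger triggers
  resume_channel.pop                   # suspended until the debugger answers
  log << "worker resumed"
end

[worker, debugger].each(&:join)
puts log
```

The debugger thread costs nothing while it waits: it sits blocked on the channel until the debugged thread sends something into it.
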
With this functionality in place, setting a breakpoint on a certain line of a method is simple:
  • Take the Method object and get its CompiledMethod object.
  • Figure out the position of the first instruction of the line of the breakpoint.
  • Exchange the instruction at this position with yield_debugger. The original instruction is kept around in a management data structure.
  • After the breakpoint is hit and the user continues execution, the original instruction is executed and then normal execution of the code resumes.
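
The steps above can be sketched with a toy interpreter. None of this is Rubinius code: ToyMethod, its instruction format and the run loop are assumptions made for the example, and only the names yield_debugger and first_ip_on_line are borrowed from the article.

```ruby
# Toy model of breakpoint patching; an instruction is [opcode, operand, source_line].
class ToyMethod
  attr_reader :code

  def initialize(code)
    @code = code
    @saved = {}   # ip => original instruction displaced by a breakpoint
  end

  # Index of the first instruction that belongs to the given source line.
  def first_ip_on_line(line)
    @code.index { |_op, _arg, src_line| src_line == line }
  end

  # Swap the first instruction on the line for yield_debugger, keeping the original.
  def set_breakpoint(line)
    ip = first_ip_on_line(line)
    @saved[ip] = @code[ip]
    @code[ip] = [:yield_debugger, nil, line]
    ip
  end

  def original_at(ip)
    @saved.fetch(ip)
  end
end

def execute(op, arg, stack)
  case op
  when :push then stack.push(arg)
  when :add  then stack.push(stack.pop + stack.pop)
  end
end

# Runs the toy method; on_break is called whenever yield_debugger is executed.
def run(toy_method, on_break)
  stack = []
  toy_method.code.each_index do |ip|
    op, arg, line = toy_method.code[ip]
    if op == :yield_debugger
      on_break.call(line, stack.dup)               # hand control to the "debugger"
      op, arg, _line = toy_method.original_at(ip)  # then run the displaced instruction
    end
    execute(op, arg, stack)
  end
  stack.last
end

toy = ToyMethod.new([
  [:push, 1, 1],
  [:push, 2, 1],
  [:add, nil, 2],
  [:push, 10, 3],
  [:add, nil, 3],
])

hits = []
toy.set_breakpoint(3)
result = run(toy, ->(line, stack) { hits << [line, stack] })
puts "result: #{result}, hits: #{hits.inspect}"
```

Instructions that were not patched run exactly as before, which is why a breakpoint costs nothing until it is hit.
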
This functionality has been around for some time (see InfoQ's interview with Evan Phoenix on topics such as debugging). However, the full speed debugger has now become usable for regular users thanks to work by Adam Gardiner, who added a command line user interface for the Rubinius debugger along with the necessary commands. He also implemented the code that allows the user to step through code line by line. Stepping works by setting a breakpoint on the line after the current one. Of course, the debugger also needs to know whether the current line is the last in a method - this is possible in Rubinius by getting a handle to the method that called the current one. The context object, i.e. the activation frame or stack frame of the method, has a sender method doing just that.
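
A rough sketch of that stepping logic, under simplifying assumptions: Frame is a hypothetical stand-in for a context object (only its sender link mirrors the description above), each frame knows the source lines of its method, and branches and loops are ignored.

```ruby
# Toy model of "step to the next line"; Frame is hypothetical, not a Rubinius class.
Frame = Struct.new(:name, :lines, :current_line, :sender)

# Returns [frame, line] where a temporary breakpoint should be placed,
# or nil when execution would leave the outermost frame.
def next_stop(frame)
  following = frame.lines.select { |l| l > frame.current_line }.min
  return [frame, following] if following

  # The current line is the last one in this method: continue in the caller.
  caller_frame = frame.sender
  caller_frame && next_stop(caller_frame)
end

outer = Frame.new(:outer, [10, 11, 12], 11, nil)
inner = Frame.new(:inner, [20, 21], 21, outer)

frame, line = next_stop(inner)
puts "#{frame.name}:#{line}"
```

Here the inner method is already on its last line, so the temporary breakpoint goes on the line after the call in the calling method.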

Using the debugger is simple. Once you have Rubinius (see how to check out and compile Rubinius), start irb by running:
shotgun/rubinius 
Then execute this:
Rubinius::VM::debugger 
(Note: just typing debugger also works at the moment). This will put you in the debugger's text interface - available commands can be seen with the "?" command, and include managing breakpoints and features like looking at the op_codes and the Ruby source of methods among others.

The full speed debugger gives Rubinius a big edge over Ruby implementations that need to rely on tracing-based debuggers (no matter how fast these implementations are). It's also important to note that the complete debugger functionality was written in Ruby - with the exception of the handful of lines of C code for the yield_debugger instruction.

Have you tried Rubinius yet? Do you have ideas about what you could do with Rubinius' transparency and accessible internals, i.e. accessing and modifying bytecode at runtime, inspecting the call stack etc.?

Also: check out InfoQ's previous coverage of Rubinius.
