InfoQ

News

Using ParseTree for LINQ-style queries and extracting metadata

Posted by Werner Schuster on Feb 14, 2008 03:00 PM

Community: Ruby
Topics: Code Analysis, Domain Specific Languages, Language
Tags: Metaprogramming, Merb, ParseTree, LISP, LINQ

With the introduction of LINQ in .NET and the resurging interest in LISP, a certain type of metaprogramming has received renewed attention. In LINQ, it's possible to get the Expression Tree, i.e. a tree-based representation, of a piece of code.

In LISP (and similar languages), this approach is known as macros or macro expansion. Macros look like function calls, with the difference that they're evaluated at compile time, i.e. when the code is loaded. The macro receives the Abstract Syntax Tree (AST) of the macro call, and the call is then replaced with the AST the macro returns. This means that the macro call itself is not the code that actually gets executed; instead, it's the AST returned by the macro that gets executed, i.e. a macro call is expanded into the actual code.
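The expansion step can be illustrated with a toy sketch in Ruby, using nested arrays as s-expressions. The node shapes, the `unless` macro, and the tiny evaluator are all invented for this illustration and do not correspond to any real LISP or Ruby library.

```ruby
# Toy macro expander over s-expressions written as nested Ruby arrays.
# The node shapes and the :unless macro are invented for illustration.
MACROS = {
  # (unless cond body) expands to (if (not cond) body)
  unless: ->(cond, body) { [:if, [:not, cond], body] }
}.freeze

# Walk the tree and replace every macro call with the AST the macro returns.
def expand(ast)
  return ast unless ast.is_a?(Array) && !ast.empty?
  head, *args = ast
  if MACROS.key?(head)
    expand(MACROS[head].call(*args)) # expand again: a macro may emit macros
  else
    [head, *args.map { |arg| expand(arg) }]
  end
end

# Minimal evaluator for the expanded tree; it knows nothing about :unless.
def evaluate(ast)
  return ast unless ast.is_a?(Array)
  head, *args = ast
  case head
  when :if  then evaluate(args[0]) ? evaluate(args[1]) : nil
  when :not then !evaluate(args[0])
  else raise ArgumentError, "unknown form: #{head.inspect}"
  end
end

expanded = expand([:unless, false, "ran"])
p expanded           # the tree that actually gets evaluated
p evaluate(expanded)
```

The evaluator only ever sees the expanded tree, which is the point made above: the macro call is replaced before anything runs.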

While Ruby doesn't have language support for accessing the AST of a piece of code, there are libraries that provide it. The most popular one is ParseTree, which returns the AST as an s-expression, i.e. nested lists of symbols and literals. The people behind ParseTree have also built a group of useful tools on top of it, such as:
  • Ruby2Ruby
    The tool takes ParseTree ASTs and formats them as Ruby source code. This makes it possible to parse Ruby code, modify it at the AST level (instead of working with the characters of the source code), and finally generate runnable Ruby code again.
  • Heckle
    A tool by Ryan Davis and Kevin Clark that uses Ruby2Ruby to introduce random changes into code to find code with insufficient test coverage.
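
The idea behind both tools can be sketched without ParseTree itself. The following is a minimal illustration, assuming a made-up node format (`:call`, `:lvar`, `:lit`) rather than ParseTree's real output; `to_ruby` and `mutate` are hypothetical helpers, not the Ruby2Ruby or Heckle APIs.

```ruby
# Toy code generator: nested arrays in, Ruby source out.
# The node shapes are invented; they are not ParseTree's actual output.
def to_ruby(node)
  raise ArgumentError, "not a node: #{node.inspect}" unless node.is_a?(Array)
  type, *rest = node
  case type
  when :lit  then rest[0].inspect
  when :lvar then rest[0].to_s
  when :call # [:call, receiver, operator, argument]
    receiver, operator, argument = rest
    "(#{to_ruby(receiver)} #{operator} #{to_ruby(argument)})"
  else raise ArgumentError, "unknown node: #{type.inspect}"
  end
end

# Heckle-style mutation: swap one operator for another throughout the tree.
def mutate(node, from, to)
  return node unless node.is_a?(Array)
  type, *rest = node
  if type == :call && rest[1] == from
    [:call, mutate(rest[0], from, to), to, mutate(rest[2], from, to)]
  else
    [type, *rest.map { |child| mutate(child, from, to) }]
  end
end

ast    = [:call, [:lvar, :x], :==, [:lit, 21]]
mutant = mutate(ast, :==, :!=)

puts to_ruby(ast)
puts to_ruby(mutant)

# Turn both versions back into runnable code and compare their behavior.
original_check = eval("lambda { |x| #{to_ruby(ast)} }")
mutant_check   = eval("lambda { |x| #{to_ruby(mutant)} }")
p original_check.call(21)
p mutant_check.call(21)
```

A test suite that passes against both the original and the mutant would not be exercising that comparison, which is the kind of gap a mutation tool is meant to expose.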

A new set of libraries is now making use of ParseTree in a way reminiscent of LINQ. Ambition lets you write queries in Ruby syntax, e.g.
LDAP::User.select { |m| m.name == 'jon' && m.age == 21 } 
Or
SQL::User.select { |m| m.name == 'jon' && m.age == 21 } 
The code inside these blocks is never actually executed. Instead, ParseTree is used to get at the AST, which is then analyzed and translated into a query in the target query language. Ambition features extensible adapters, which make it possible to write new translators from Ruby ASTs to other query languages.
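
The translation step can be sketched as follows. This is not Ambition's implementation: it uses Ripper from Ruby's standard library instead of ParseTree, takes the expression as a source string rather than a block, and handles only the few node types that appear in the example. The names `to_sql`, `where_clause`, and `SQL_OPERATORS` are hypothetical.

```ruby
require 'ripper'

# Maps the Ruby operators this sketch understands to SQL operators.
SQL_OPERATORS = { '==' => '=', '!=' => '<>', '>' => '>', '<' => '<',
                  '&&' => 'AND', '||' => 'OR' }.freeze

# Hypothetical translator from a Ripper s-expression to a SQL condition.
# It covers only the node types needed here and does no escaping,
# so it must not be used with untrusted input.
def to_sql(node)
  case node[0]
  when :program then to_sql(node[1][0]) # first statement only
  when :binary
    operator = SQL_OPERATORS.fetch(node[2].to_s)
    "#{to_sql(node[1])} #{operator} #{to_sql(node[3])}"
  when :call           then node[3][1] # m.name becomes the column "name"
  when :@int           then node[1]
  when :string_literal then "'#{node[1][1][1]}'"
  else raise ArgumentError, "unsupported node: #{node[0].inspect}"
  end
end

# The source is parsed but never executed.
def where_clause(source)
  to_sql(Ripper.sexp(source))
end

puts where_clause("m.name == 'jon' && m.age == 21")
```

An adapter for a different backend, such as LDAP filters, would walk the same tree and emit a different string for each node type.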

Another library that adds this style of queries is Sequel. While it's primarily an ORM, Sequel also lets you write queries in Ruby:
old_nonruby_posts = posts.filter {:stamp > 1.month.ago && :category != 'ruby'} 
It's important to note that, unlike in Ambition, this is just one of Sequel's ways of writing queries; it also accepts queries written as string literals.

A very different way of making use of the AST of Ruby code can be found in Merb, where it is used to implement the following feature:
Parameterized Actions:
If you specify parameters in your action methods, incoming query parameters will automatically get assigned as appropriate. Some examples:
class Foos < Merb::Controller
  def index(id, search_string = "%")
    @foo = Foo.find_with_search(id, search_string)
  end
end
Going to /foos/index/12 will call the index method with the parameters "12" and "%" (the default provided). Going to /foos/index will throw a BadBehavior error (status code 400) because id is a required parameter, but it was not passed in. Going to /foos/index/5?search_string=hello will call the index method with parameters "5" and "hello". The bottom line is that you get to use your actions like real methods.
The feature is implemented by getting the AST of the method that handles the action and extracting the default arguments from it. In a way, this allows a kind of introspection/reflection that's not normally available.
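
The dispatching behavior described above can be approximated with a short sketch. Note that it swaps in a different technique: instead of reading defaults from the AST, it relies on `Method#parameters`, which later Ruby versions provide and which reports whether each parameter is required or optional. The `dispatch` helper, the `MissingParameter` error, and the simplified `Foos` class are hypothetical and are not Merb code.

```ruby
# Hypothetical stand-in for the error a framework would map to status 400.
class MissingParameter < StandardError; end

# Simplified controller; a real Merb controller would inherit from
# Merb::Controller and query a model.
class Foos
  def index(id, search_string = "%")
    [id, search_string]
  end
end

# Hypothetical dispatcher: binds request parameters to method arguments by
# name. Only required and optional positional parameters are handled.
def dispatch(controller, action, request_params)
  args = []
  controller.method(action).parameters.each do |kind, name|
    key = name.to_s
    if request_params.key?(key)
      args << request_params[key]
    elsif kind == :req
      raise MissingParameter, "missing required parameter: #{key}"
    else
      break # optional and absent: stop here so Ruby applies the defaults
    end
  end
  controller.send(action, *args)
end

p dispatch(Foos.new, :index, "id" => "12")
p dispatch(Foos.new, :index, "id" => "5", "search_string" => "hello")
```

Because the default values never need to be known in this variant, the method's own defaults take effect whenever an optional argument is left out.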

These examples show the power of this type of introspection. However, Sequel and Merb also show one downside of this approach: the ParseTree-based features are optional, i.e. the tools don't rely on them. If ParseTree is not available on the system, these features are simply not available. This is due to ParseTree's nature as a native Ruby extension. Some of the deployment problems of a native extension are being solved, e.g. ParseTree on Windows or ParseTree on Mac OS X.
Problems remain, though. ParseTree doesn't support Ruby 1.9 yet, although possible solutions are being considered. Ruby 1.9 actually comes with some access to ASTs through Ripper. There's very little information available about Ripper, but one way of using it is as a SAX-style parser, for example:
require 'ripper'
class MyRipper < Ripper
  def on_gvar(node)
    puts node
  end
  def on_int(node)
    puts node
  end
  # etc.
  # Handle each element of the AST with an on_* method
end
It can then be used like this:
f = MyRipper.new("$foo = 1") 
f.parse
Besides the missing Ruby 1.9 support, ParseTree also has varying support on alternative Ruby implementations. Rubinius makes heavy use of ParseTree's AST representation. JRuby has a nearly complete port of ParseTree, but the .NET-based Ruby implementations seem to be without support for now.

    Haml safemode

    Feb 18, 2008 6:59 AM by Werner Schuster

    Another use of ParseTree: http://www.artweb-design.de/2008/2/17/sending-ruby-to-the-jail-an-attemp-on-a-haml-safemode Basically, this makes it possible to jail code in Haml templates.
