Static Analysis Tools Roundup: Roodi, Rufus, Reek, Flay
Static analysis tools help keep code quality up and warn of potential bugs. Compilers for statically compiled languages often run static analysis checks and report potential problems as warnings. Popular stand-alone tools include C's lint and Smalltalk Lint. Many modern IDEs also perform static analysis on code, often incrementally as the code is edited.
For a long time, static analysis tools for Ruby suffered from the lack of a standard way of accessing the Abstract Syntax Tree (AST) of Ruby source. One solution was the ParseTree gem, which uses a native extension to access the parse tree of parsed Ruby code. One problem with ParseTree is its dependency on native code. ParseTree is also only available on Ruby 1.8 and is unlikely to be supported on 1.9 (Ruby 1.9 comes with Ripper, a library that can parse source files but does not give access to parse trees at runtime). ParseTree support across new Ruby implementations is inconsistent at the moment.
The introduction of ruby_parser, a Ruby parser written in Ruby, promises to fix these problems. Version 2.0 of the project was released recently; it improved performance and, importantly, added line numbers as metadata to the ASTs. Line numbers are crucial for static analysis tools, which need to report the location of a discovered problem.
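To illustrate why line numbers in the tree matter, here is a minimal sketch that lists method definitions together with the line they start on. It is an assumption-laden stand-in: it uses Ripper from the standard library instead of ruby_parser so that it runs without installing a gem, and Ripper's nested-array tree is shaped differently from the trees ruby_parser produces. The helper name method_locations is made up for this example.

```ruby
require 'ripper'

# Walk a Ripper tree (nested arrays) and collect [name, line] pairs for
# every plain `def`. Ripper token nodes look like
# [:@ident, "text", [line, column]].
def method_locations(node, found = [])
  return found unless node.is_a?(Array)

  if node[0] == :def && node[1].is_a?(Array) && node[1][0] == :@ident
    name_token = node[1]
    found << [name_token[1], name_token[2][0]]
  end

  node.each { |child| method_locations(child, found) }
  found
end

source = <<~'RUBY'
  def greet(name)
    puts name
  end

  def farewell(name)
    puts name
  end
RUBY

method_locations(Ripper.sexp(source)).each do |name, line|
  puts "line #{line}: def #{name}"
end
```

Without the position stored in the tree, a tool could say that a problem exists but not where to find it.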
A pure Ruby parser is an important point, considering that all current Ruby IDEs are written either in Java (Eclipse based IDEs such as Aptana or 3rdRail, Netbeans' Ruby support, JetBrains' RubyMine) or .NET (Ruby In Steel, based on Visual Studio). All of these IDEs also feature static analysis of Ruby code, but none of that analysis code is written in Ruby. Static analysis code built on a Java or .NET based Ruby parser and AST obviously doesn't run on MRI or other Ruby implementations. UnifiedRuby is a cleaned up version of ParseTree's output, and in combination with ruby_parser it's now possible to parse and analyze Ruby source code in pure Ruby.
A growing list of static analysis tools has become available in the past few months.
Flay, written by Ryan Davis, checks codebases for duplicated code. By using an AST instead of the source text, it can compare code structurally, so copy/pasted code can be detected even if, say, literal values were modified. Ryan previously released another static analysis tool, flog, which calculates a score for a codebase based on various patterns considered bad, e.g. large numbers of dependencies. Both flay and flog can be used from the command line to check codebases. Flay uses ruby_parser to parse Ruby code.
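The idea of structural comparison can be sketched roughly as follows. This is not Flay's algorithm or API; it is a simplified illustration that uses Ripper from the standard library rather than ruby_parser. It strips positions and literal values from each method's tree and groups methods whose remaining structure is identical. The names normalize, each_def and duplicate_methods are invented for this example.

```ruby
require 'ripper'

# Token types whose text is treated as an interchangeable literal value.
LITERAL_TOKENS = [:@int, :@float, :@tstring_content].freeze

# Return a copy of the tree without source positions and without the
# text of literal tokens, so only the structure remains.
def normalize(node)
  return node unless node.is_a?(Array)

  if node[0].is_a?(Symbol) && node[0].to_s.start_with?('@')
    return LITERAL_TOKENS.include?(node[0]) ? [node[0]] : [node[0], node[1]]
  end

  node.map { |child| normalize(child) }
end

# Yield every plain `def` node found in the tree.
def each_def(node, &block)
  return unless node.is_a?(Array)

  yield node if node[0] == :def
  node.each { |child| each_def(child, &block) }
end

# Group method names by the normalized shape of their parameters and body.
def duplicate_methods(source)
  groups = Hash.new { |hash, key| hash[key] = [] }
  each_def(Ripper.sexp(source)) do |defn|
    groups[normalize(defn[2..-1])] << defn[1][1]
  end
  groups.values.select { |names| names.size > 1 }
end

source = <<~'RUBY'
  def tax(amount)
    amount * 0.2 + 1
  end

  def tip(amount)
    amount * 0.15 + 2
  end

  def total(amount)
    amount - 1
  end
RUBY

duplicate_methods(source).each do |names|
  puts "structurally identical: #{names.join(', ')}"
end
```

Here tax and tip differ only in their literal values, so they end up in the same group, while total has a different shape and is left alone. A text-based comparison would see three different methods.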
Reek by Kevin Rutherford is "a code smells detector for ruby". It comes with a list of checks that detect long method bodies, large classes, bad names, etc. The checks are written as SexpProcessor subclasses, which work as visitors over the AST. Reek's code is hosted on GitHub.
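The visitor approach can be sketched like this. The code below does not use SexpProcessor or any of Reek's classes; TreeVisitor and ShortNameCheck are hypothetical names, the tree comes from Ripper in the standard library, and the minimum name length is an arbitrary value chosen for the example. What it shares with the approach described above is the dispatch style: a node of a given type is handed to a process_<type> method if the subclass defines one.

```ruby
require 'ripper'

# Minimal visitor: each node is dispatched to process_<type> when the
# subclass defines such a method, then its children are visited.
class TreeVisitor
  def visit(node)
    return unless node.is_a?(Array)

    if node[0].is_a?(Symbol)
      handler = "process_#{node[0]}"
      send(handler, node) if respond_to?(handler)
    end
    node.each { |child| visit(child) }
  end
end

# Example check: warn about very short method names.
class ShortNameCheck < TreeVisitor
  MIN_LENGTH = 3 # arbitrary threshold for this sketch

  attr_reader :warnings

  def initialize
    @warnings = []
  end

  # Called for every [:def, name_token, params, body] node.
  def process_def(node)
    type, name, position = node[1]
    return unless type == :@ident
    return if name.length >= MIN_LENGTH

    @warnings << "line #{position[0]}: method name '#{name}' is very short"
  end
end

check = ShortNameCheck.new
check.visit(Ripper.sexp("def x\nend\n\ndef compute\nend\n"))
puts check.warnings
```

Adding another check means writing another subclass with the process_ methods it cares about; the traversal itself stays in one place.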
Roodi is similar to Reek in that it runs a list of checks over a codebase. Roodi comes with checks that ensure methods or modules comply with a naming convention, a maximum parameter count, etc. Other checks offer advice such as avoiding for loops. The shipped checks can be configured with a YAML file, and new checks are easy to write: a checker class registers the types of AST nodes it's interested in and then handles the matched subtrees.
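The registration pattern might look roughly like the sketch below. It is not Roodi's API: ForLoopCheck, CheckRunner, interesting_nodes and evaluate are names assumed for illustration, and the tree again comes from Ripper in the standard library rather than ruby_parser. Each check declares the node types it wants, and the runner only hands it matching subtrees.

```ruby
require 'ripper'

# Find the line of the first token inside a subtree. This approximates
# the line on which the construct starts.
def first_line(node)
  return nil unless node.is_a?(Array)
  return node[2][0] if node[0].is_a?(Symbol) && node[0].to_s.start_with?('@')

  node.each do |child|
    line = first_line(child)
    return line if line
  end
  nil
end

# A check registers the node types it is interested in and handles
# each matched subtree.
class ForLoopCheck
  def interesting_nodes
    [:for]
  end

  def evaluate(node, errors)
    errors << "line #{first_line(node)}: avoid 'for' loops; prefer 'each'"
  end
end

# Builds a node-type => checks table and walks the tree once.
class CheckRunner
  def initialize(checks)
    @handlers = {}
    checks.each do |check|
      check.interesting_nodes.each { |type| (@handlers[type] ||= []) << check }
    end
  end

  def run(source)
    errors = []
    walk(Ripper.sexp(source), errors)
    errors
  end

  private

  def walk(node, errors)
    return unless node.is_a?(Array)

    @handlers.fetch(node[0], []).each { |check| check.evaluate(node, errors) }
    node.each { |child| walk(child, errors) }
  end
end

source = "x = 1\nfor i in [1, 2]\n  puts i\nend\n[1].each { |j| puts j }\n"
puts CheckRunner.new([ForLoopCheck.new]).run(source)
```

Because checks are plain objects passed to the runner, the list of active checks and their settings could be built from a configuration file.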
Rufus by John Mettraux checks Ruby code for unwanted or unsafe constructs. The library can check Ruby source code before it is loaded. For example, loading a Ruby file that consists of a single line like exit is probably a bad idea. The library can be configured with custom patterns of code to be excluded.
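A check of this kind, run before any code is loaded, can be sketched as follows. This is not the Rufus API; violations, check_source! and UnsafeCodeError are hypothetical names, the excluded list is an arbitrary example, and the parser is Ripper from the standard library. The sketch is deliberately conservative: it flags any identifier with an excluded name, even when that identifier happens to be a local variable.

```ruby
require 'ripper'

class UnsafeCodeError < StandardError; end

# Example list of identifiers that should not appear in checked code.
EXCLUDED_IDENTIFIERS = %w[exit abort system exec].freeze

# Collect a message for every identifier token with an excluded name.
def scan_tree(node, excluded, found)
  return unless node.is_a?(Array)

  if node[0] == :@ident && excluded.include?(node[1])
    found << "line #{node[2][0]}: '#{node[1]}' is not allowed"
  end
  node.each { |child| scan_tree(child, excluded, found) }
end

# Return the list of violations found in the source text.
def violations(source, excluded = EXCLUDED_IDENTIFIERS)
  tree = Ripper.sexp(source)
  raise UnsafeCodeError, 'source does not parse' if tree.nil?

  found = []
  scan_tree(tree, excluded, found)
  found
end

# Raise unless the source is free of excluded identifiers.
def check_source!(source, excluded = EXCLUDED_IDENTIFIERS)
  problems = violations(source, excluded)
  raise UnsafeCodeError, problems.join('; ') unless problems.empty?

  true
end

begin
  check_source!('exit')
rescue UnsafeCodeError => e
  puts "rejected: #{e.message}"
end
```

The important property is that the source is only parsed, never executed, so the check itself cannot trigger the behavior it is looking for.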
Do you plan to add one or more of these tools to your continuous integration setup? What checks would you like to see or write?