Ruby's Open Classes - Or: How Not To Patch Like A Monkey
Rails developers who watched the recent Ruby 1.8.7 preview releases soon noticed something about 1.8.7 Preview 1: it broke Rails. The reason was the addition of the method Symbol#to_proc, which was backported from Ruby 1.9. Adding this method makes it possible to write certain code in a more compact way (see details about
So what happened? Rails had already added the to_proc method to Symbol. However, the method that Ruby 1.8.7 Preview 1 added behaved slightly differently from the one Rails added.
Fortunately, Rails has quite a few users, so the problem was quickly reported, and the final version of Ruby 1.8.7 has a version of Symbol#to_proc that works.
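To make the compact style concrete, here is a minimal sketch of what Symbol#to_proc enables. The HandRolled module and its to_proc_for method are hypothetical names used only for illustration; they show roughly how such a method can be written by hand and are not the Rails or Ruby source.

```ruby
words = %w[ruby rails symbol]

# Explicit block form.
with_block = words.map { |w| w.upcase }

# Compact form: the & operator calls Symbol#to_proc on :upcase
# and passes the resulting proc as the block.
with_symbol = words.map(&:upcase)

# Hypothetical hand-rolled equivalent, kept in its own module instead of
# being added to Symbol: send the symbol's name to the first argument.
module HandRolled
  def self.to_proc_for(sym)
    proc { |obj, *args| obj.send(sym, *args) }
  end
end

with_helper = words.map(&HandRolled.to_proc_for(:upcase))

puts with_block == with_symbol && with_symbol == with_helper # prints "true"
```

All three calls produce the same array; the difference is only in how much has to be typed, which is why libraries were tempted to add the method themselves.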
The Problem
Ruby's Open Classes are a useful feature that allows adding methods to an already loaded class, as simply as
class String; def foo; "foo"; end; end # reopens the already loaded String class
puts "".foo # prints "foo"
The problem with Open Classes is easy to see if we apply an old and undisputed principle of software design: modularity. Over the years, an enormous number of concepts have been developed to gain modularity in the face of growing code bases: local variables (vs. global variables), lexical scoping (vs. dynamic scoping), numerous namespace systems, etc. This process is still ongoing - just consider the idea of "component oriented" development, where software should be composable in the same way as physical components are assembled into products. Modularity, as we see, is an important property of software.
It is this property that speaks against Open Classes and freewheeling Monkeypatching (as this feature is also known, particularly in the Python community). Any library developer who opens an existing class must answer this: is the added method really so necessary in this class that I must break modularity? Let's reiterate the issues we get:
- Potential name clashes and interaction with 3rd party libraries
A client program usually doesn't rely on just a single library - each additional library increases the probability that another library also adds something to an already modified class and causes a name clash. Even if that doesn't seem likely for some reason, opening very basic classes of the Ruby standard library definitely poses a problem. Some solutions require opening Object - in that case, every subclass of it has the added method(s). This is a bigger problem than clashing with another monkeypatching library. Why? Because every class in the system is derived from Object, so the added method is now in the namespace of every class. So unless the added method has a name that includes, say, a SHA-1 hash value, there's a chance it'll clash with another method.
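The clash described above can be sketched as follows. Both "libraries" and the method name to_label are hypothetical; the point is only that two independent additions to Object with the same name silently replace each other, and that every object in the process is affected.

```ruby
# Hypothetical library A adds a helper to Object.
class Object
  def to_label
    "A:#{self.class}"
  end
end

label_from_a = "x".to_label # => "A:String"

# Hypothetical library B, loaded later, picks the same name
# for a different purpose.
class Object
  def to_label
    to_s.upcase
  end
end

# Code written against library A now silently gets B's behavior.
label_after_b = "x".to_label # => "X"

# The method is also visible on every other object in the process.
puts Object.new.respond_to?(:to_label) # prints "true"
```

Ruby raises no error here (at most a "method redefined" warning when warnings are enabled), so whichever library is loaded last wins.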
- Compatibility with future Ruby versions and stdlibs
A recent and high-profile example is the Symbol#to_proc method - libraries added this to allow for a special, terse syntax for certain operations. In Ruby 1.9, this was added to the Ruby stdlib's Symbol class. This shows another source of name clashes: if a name is general enough, a future Ruby version might include the method too. That might be fine if the method does exactly the same thing, but it's a problem if it doesn't. In that case, redefining the method can break the system, i.e. the Ruby stdlib and all its clients that rely on the behavior of the standard Ruby library.
How to avoid Open Classes by design
One reason to open a class is to make objects of a class support a certain protocol or interface, i.e. a set of messages/methods (read a longer explanation of the term protocol in the context). Here are alternative solutions to achieve this.
Adapters
The Adapter pattern basically allows you - given some object X - to look up another object that supports a certain protocol and can act on behalf of object X.
An example of widespread application of the Adapter pattern can be found in Eclipse, where it helps to keep the platform extensible and modular. An example of the use of Adapters: getting an Outline GUI for an Editor:
OutlinePage p = editor.getAdapter(OutlinePage.class);
The class of the editor object can directly return an OutlinePage object that knows how to display an outline for the editor's content. If this particular editor doesn't implement Outline functionality, the protocol of the getAdapter method suggests forwarding the call to a central lookup system. Which brings us to the extensibility/modularity part: even if the editor's vendor hasn't supplied an Outline GUI, another Eclipse plugin can provide one. Advantages of the Adapter pattern: there is no need to modify domain classes to add functionality, since the adapter contains all the logic to adapt from the desired interface to the original object; it allows orthogonal changes to the interface; and no global changes are necessary. For more information about the Eclipse version of the Adapter pattern, read "What is IAdaptable?" by Alex Blewitt.
An example of the use of Adapters in a dynamic language comes from ZOPE. In his presentation "Using Grok to Walk Like a Duck", Brandon Craig Rhodes describes the experience of building ZOPE over the years, and explores the pros and cons of different approaches to "make an object which is not a duck behave like a duck". The solution describes several ways of defining and providing Adapters.
While these Adapter implementations might seem like overkill in small applications, they do help keep an application modular. There is one difference from Open Classes: the returned Adapter is not necessarily the same object as the adapted one, whereas with Open Classes (or Singleton classes - see next section) it's possible to add behavior to all objects of a particular type. Whether that's a crucial feature or not depends on your application.
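In Ruby terms, the lookup idea behind Adapters might be sketched like this. The Adapters registry, its register and adapt methods, the WordCounter class and the :word_counter protocol name are all hypothetical; this is not the Eclipse or Zope API, just a small illustration of adding a capability for String without reopening String.

```ruby
# Hypothetical central lookup: maps [adaptee class, protocol] to a factory.
module Adapters
  REGISTRY = {}

  def self.register(klass, protocol, &factory)
    REGISTRY[[klass, protocol]] = factory
  end

  # Walks the adaptee's ancestors so an adapter registered for a
  # superclass also applies to its subclasses. Returns nil if none fits.
  def self.adapt(object, protocol)
    object.class.ancestors.each do |klass|
      factory = REGISTRY[[klass, protocol]]
      return factory.call(object) if factory
    end
    nil
  end
end

# The adapter holds the new behavior; String itself stays untouched.
class WordCounter
  def initialize(text)
    @text = text
  end

  def word_count
    @text.split.size
  end
end

Adapters.register(String, :word_counter) { |text| WordCounter.new(text) }

counter = Adapters.adapt("Hello Adapter pattern", :word_counter)
puts counter.word_count                 # prints "3"
puts "".respond_to?(:word_count)        # prints "false"
```

The price is the extra lookup step at the call site; the gain is that two libraries registering adapters for the same class cannot overwrite each other's methods on that class.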
Singleton Classes
Ruby allows you to modify the class of one particular object. It does so by creating a new class, a Singleton class, from the object's original class. Here is how to do this:
a = "Hello"
class << a; def foo; "foo"; end; end # opens the Singleton class of a only
a.foo # returns "foo"
The effects of these changes are kept local to the object - no other classes or objects are affected. For more information and examples on when to use Singleton classes, see InfoQ's article "Using singleton classes for object metadata".
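The locality of a Singleton class can be checked directly. In this small sketch the method name shout is a hypothetical example; the strings are created with dup so that they are ordinary, unfrozen objects.

```ruby
a = "Hello".dup
b = "Hello".dup

# Open the singleton class of a; only this one object gains the method.
class << a
  def shout
    upcase + "!"
  end
end

a_result        = a.shout                        # => "HELLO!"
a_singletons    = a.singleton_methods            # => [:shout]
b_responds      = b.respond_to?(:shout)          # => false
string_has_it   = String.method_defined?(:shout) # => false

puts a_result # prints "HELLO!"
```

An equal string b and the String class itself are unaffected, which is the property that makes this approach safer than reopening String.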
How to safely use Open Classes
If you really need to open a class, here are some tips to reduce the risk. Jay Fields lists different ways of adding methods to classes. The solutions are
- Close on an unbound method
- Extend a module that redefines the method and uses super
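The two techniques in the list can be sketched in a few lines. This is a minimal illustration under assumed names: Greeter, greet and Shouting are hypothetical and not taken from Jay Fields' article.

```ruby
class Greeter
  def greet(name)
    "Hello, #{name}"
  end
end

# Technique 1: close on an unbound method.
# The original implementation is captured in a local variable, so no
# alias name is added to the class that could clash with anything else.
class Greeter
  original_greet = instance_method(:greet) # an UnboundMethod

  define_method(:greet) do |name|
    original_greet.bind(self).call(name) + "!"
  end
end

# Technique 2: extend a module that redefines the method and uses super.
# Only the extended object changes; super reaches the class's own greet.
module Shouting
  def greet(name)
    super.upcase
  end
end

plain = Greeter.new
loud  = Greeter.new
loud.extend(Shouting)

puts plain.greet("Ruby") # prints "Hello, Ruby!"
puts loud.greet("Ruby")  # prints "HELLO, RUBY!"
```

The first technique still changes the class for everyone but leaves no extra method names behind; the second limits the change to the objects that were explicitly extended.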
Finally, collect class extensions in one location, e.g. by putting them all in a file named extensions.rb. By sticking to this convention, all extensions are instantly visible to anyone reading the code without requiring any special IDEs or class browsers that show where methods come from. It also acts as documentation of which classes are affected.
Safe approaches to Open Classes in Ruby and other languages
The idea of extending existing classes isn't unique to Ruby. Other languages have support for similar features, and some have found solutions that don't pollute the global namespace.
One concept is called Classboxes. Implementations are available for Squeak Smalltalk, but also Java or .NET. The basic idea:
Classical modules systems support well the modular development of applications but lack the ability to add or replace a method in a class that is not defined in that module. But languages that support method addition and replacement do not provide a modular view of applications, and their changes have a global impact. The result is a gap between module systems for object-oriented languages on one hand, and the very desirable feature of method addition and replacement on the other hand.
To solve these problems we present classboxes, a module system for object-oriented languages that allows method addition and replacement. Moreover, the changes made by a classbox are only visible to that classbox (or classboxes that import it), a feature we call local rebinding.
C#'s extension methods provide another way to approach the problem. Unlike Open Classes in Ruby, extension methods don't change the actual classes. Instead, they're only visible for the source that defines the extension methods - in short: it's really all implemented in the compiler. An example (from the linked article):
public static int WordCount(this String str)
As you can see, the method gets the this pointer to the object it's working on. To make the extension visible:
using ExtensionMethods;
And done - you can now use the new method:
string s = "Hello Extension Methods";
int i = s.WordCount();
The benefit of this approach: the extension method is only visible in code that explicitly imports it.
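For comparison in Ruby terms: later Ruby versions gained refinements, which follow a similar "visible only where imported" idea. The sketch below assumes Ruby 2.1 or later; the module names ExtensionMethods and Report and the method word_count are hypothetical and simply mirror the C# example.

```ruby
# The extension lives in a refinement instead of in String itself.
module ExtensionMethods
  refine String do
    def word_count
      split.size
    end
  end
end

# Only code inside a scope that says "using" sees the added method.
module Report
  using ExtensionMethods

  def self.count(text)
    text.word_count
  end
end

puts Report.count("Hello Extension Methods") # prints "3"

# Outside of Report the method does not exist on String.
begin
  "Hello".word_count
rescue NoMethodError
  puts "word_count is not visible here"
end
```

As with the C# version, code that never activates the refinement is unaffected, so two libraries can define a method of the same name on String without interfering.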
The debate about Monkeypatching/Open Classes has already prompted experiments with workarounds. This workaround by coderr allows wrapping code in a context that keeps the extensions local. One issue with this solution is that it requires a Ruby native extension to hook into the Ruby interpreter (it uses RubyInline; look for inline method calls to find the C code that does the work).
A different approach is Reginald Braithwaite's Rewrite gem. It uses ParseTree to get the AST for Ruby code and uses this to make added methods visible in certain contexts. InfoQ discussed the Rewrite gem in more detail before. The Rewrite gem also relies on native extensions (in this case ParseTree) to work.
Conclusion
We saw that the Open Class feature can cause problems when used carelessly - a fate it shares with for-loops, dynamic memory allocation and many more language features. Of course the emphasis is on carelessly. We saw that issues like name clashes really do occur in the real world. With this in mind, we looked at alternative solutions to modifying existing classes (using Adapters) - and if that's not an option, how to use Open Classes as safely as possible. Finally, less intrusive solutions to Open Classes have been considered for future versions of Ruby - for an idea of how they might look we looked at solutions in other languages (Classboxes, C# extension methods, etc).
I take away the opposite lesson from the Symbol#to_proc incompatibility problem. What I take away is "use open classes (within reason) because it will give you great functionality that you need without waiting for some official vendor to support it".
Don't get me wrong. Alternatives should certainly be considered. Open classes are not the solution for every problem and your article lists and explains some great alternatives. But don't be scared of open classes either.
Rails popularized the Symbol#to_proc method and it may not have been included in Ruby 1.9 and backported to Ruby 1.8 had Rails not popularized it. Symbol#to_proc is an extremely useful method that has saved countless developers countless snippets of time. By having open classes Rails was able to include this functionality YEARS before some official vendor supported it. Even once the vendor did support it there was a slight incompatibility which was resolved. Now each case is different. This time the issue was resolved by the official vendor changing their code. Maybe next time the application (or framework) will need to change its code to get the issue resolved. But either way we traded countless amounts of time saved for just a few moments of "hmmm.... I guess we need to change this to that". Sounds like a big win to me.
Stop being afraid of the dynamic language and embrace it. Have good testing to catch issues like this before deployment and then love the benefits you reap. Don't go back to the old non-agile way of doing things where you write reams of extra convoluted code just to ensure you are not affected by some unlikely issue that might happen in the future.
You are going to have compatibility problems no matter how careful you are. So just use common sense and deal with anything when it comes.
Overstated as Written
Only for "clients" in that process or that require the offending library. The way it's written - and with system in italics - it sounds like you can hose the standard library for all Ruby applications running on a box by using open classes.
Patch like a Monkey
There is also a "monkey" package for Python which only applies the patch if a hash signature of the patched method is unchanged, which is an extra careful way of patching so that the developer can be alerted when the upstream source being patched has changed (pypi.python.org/pypi/monkey).
Monkey patching was originally called "guerilla patching" (a term started at Zope Corp.) because a developer who disagreed with a choice made upstream could apply their own behaviour without needing to have the patch reviewed and applied upstream. Guerilla patching was misheard as Gorilla patching and from that the term Monkey patching was invented to sound "less forceful than a gorilla". Zope developers still tend to use this term only for opening classes to fix or extend them to address undesirable behaviour upstream - other use cases for re-opening a class are usually just called "dynamic class modification". However, in the larger Python and Ruby communities this distinction isn't usually made and Monkey patching usually refers to any re-opening of a class, regardless of intent.
The adapter pattern developed in Zope (and the convention-over-configuration technique applied by Grok which avoids the need for XML sit-ups) rocks, and is a better alternative in many cases than the venerable monkey patch. Syntactically Python adaptation is much more concise than the Java equivalent :)
OutlinePage p = editor.getAdapter(OutlinePage.class);
p = IOutlinePage(editor)
BTW, it's Zope and not ZOPE, which is akin to saying RUBY or JAVA, all-caps are yucky :P
Re: Overstated as Written
Re: Eric Anderson
BTW: experimentation like this is fine enough - that doesn't mean that you want to put something like this in a library. We recently had a presentation on how Mingle was built:
One thing that I noticed was the liberal use of Open Classes - something which is fine if you build an application which is _never_ going to be part of something else.
However: if you release a library, you have no idea at all where it'll end up and with which other libraries it'll share an address space.
How would you like it if you just spent a day or two tracking down a problem... only to find out that developers of library A had to slap some odd method on Object, just to save a few keystrokes.
There is, however, the issue that it's very easy to use open classes... and it's very easy to ignore the very real issues.
Mind you: there are ways to avoid these issues _AND_ reap the benefits - I gave a few solutions in other languages (Classboxes, Extension Methods), and there are more which come as a neat side effect of strict enforcement of modularity: eg.
Or watch Gilad Bracha's talk on Newspeak:
So, yeah: use Open Classes... just don't patch like a monkey...