Kevin Rutherford on Refactoring Ruby And Code Smells


1. Kevin, who are you?

Hi, I’m an Agile coach from the Manchester area in the UK. I’ve been developing software for 30+ years, and right now most of what I do is working with teams, usually small ones, helping them to be great.

   

2. What does an Agile coach do?

I guess you’d get a different answer from every Agile coach. What I do is join the executive team and the management team, and I work with everybody, from developers to executives and everyone in between, basically pairing with everyone, showing them different ways of doing what they do, and showing them how the overall goal of their organization is affected by the things they do on a minute-to-minute basis.

I help everybody line up and work towards that goal in a more effective and efficient way. Some of that is management practices, some of that is team practices, some of it is personal practices, some of it is how they think about their code, how they write their code, the tools they use - it’s the whole thing.

   

3. Do you subscribe to any Agile methodologies or do you use general knowledge? How does that work?

It’s whatever works best for the team, the tools, and the circumstances. I visit a lot of teams who have just tried Scrum or XP, for example, and have not had much success with it. It doesn’t really matter which brand of process you are using. The important thing is to deliver valuable software rapidly.

   

4. You wrote a book about Refactoring in Ruby. How does that work? I think everybody knows you can’t refactor in Ruby.

Ruby makes it slightly harder. In a language like Java, for example, which is the language that Martin Fowler was using when he wrote the original refactoring catalogue, you have a lot of clues in the code as to what the types of things are. The code is telling you very explicitly about things that are wrong, whereas in Ruby all of that is inferred by the runtime.

As a developer you don’t necessarily see all that sort of stuff in your face. One thing that means is that it’s even more essential in Ruby to have a very good test suite, because there is no compile step to catch problems early.

It’s absolutely essential to have a really good set of tests and good test coverage. But other than that, refactoring is just as easy, just the same. The same general techniques apply.
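To illustrate the point about the missing compile step, here is a minimal sketch. The `Invoice` class and the method names are hypothetical and are not taken from the book. It assumes a rename from `sum_amounts` to `total` in which one caller was missed: Ruby loads the class happily, and the stale call only fails when it is actually executed, which is the kind of problem a test would catch.

```ruby
# Hypothetical example class; the names are illustrative only.
class Invoice
  def initialize(amounts)
    @amounts = amounts
  end

  def total
    @amounts.sum
  end

  # Assume `sum_amounts` was renamed to `total` and this caller was missed.
  # Ruby still loads the file without complaint.
  def summary
    "Total: #{sum_amounts}"
  end
end

invoice = Invoice.new([10, 20])
puts invoice.total # the renamed method works

begin
  invoice.summary
rescue NameError => e
  # The stale call only fails when this line actually runs.
  puts "Caught at runtime: #{e.class}"
end
```

A test that simply calls `summary` would expose the broken reference immediately after the rename.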

   

5. What’s your book about? Is it a translation of the original book to Ruby? Does it add something? Does it remove something? What is it like?

I worked with Bill Wake, who wrote the Refactoring Workbook back in 2002, which is a Java workbook to help you understand and practice refactoring, but this is a completely new book. The code smells in Ruby are slightly different, the refactoring steps are slightly different, the type cues in the language are different.

We rewrote the whole thing from the ground up. The core of the book is a collection of 40-odd code smells (how to recognize them, how they got there, and what to do about them) and over 100 exercises where you get to look at some code, think about it, and practice refactoring and recognizing code smells.

It’s a practical handbook for the crucial step of writing Agile code, which is learning to recognize bad code and then learning to fix it.

   

6. What’s a code smell? What granularity is a code smell? How big is it or how small?

They vary. At the smallest level you’ve perhaps got an uncommunicative variable name or parameter name, a variable named X or str2 or something. At the other end of the scale you have things like shotgun surgery, which we see everywhere: a new requirement comes in, and in order to implement that requirement you have to change 30 classes, because a single piece of knowledge or responsibility has been distributed rather than being held and encapsulated in one place.

The scope is the whole range possible.

   

7. Essentially you teach the reader to detect these code smells and give them ways to fix them.

Yes. Those fixes aren’t really recipes, because so much depends on the context you’re in. But they’re hints, they’re guides, they’re things to look out for. The main essence of the book is the practice, the actual worked exercises.

   

8. How did you find these code smells?

We had a massive start from Martin Fowler’s original refactoring catalogue, which has a chapter of code smells in it, which were collected by Fowler and Kent Beck. Then, in the 10 years since then, the community added a few more. In writing the Ruby book we added a few more as well that are fairly specific to Ruby.

It’s a community thing, most of them are well known. We renamed a few because some of the names are old and don’t necessarily apply in Ruby or in this century. But by and large, most of them are well known.

   

9. You wrote a tool called Reek to automatically detect code smells. How does that work?

Reek is a very naïve attempt at code smell recognition in Ruby. It doesn’t do very well, because it usually doesn’t have enough type information, but it has a few heuristics, such as what constitutes a bad name: anything with one character, anything ending in a number, and a few other rules like that.
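The two naming rules mentioned here can be sketched in a few lines. This is only an illustration of the heuristic as described in the interview: `uncommunicative_name?` is a hypothetical helper, not part of Reek's API, and Reek's real rules may differ.

```ruby
# Hypothetical helper for illustration; this is not Reek's API.
# Flags a name that is a single character or that ends in a digit.
def uncommunicative_name?(name)
  text = name.to_s
  text.length == 1 || text.match?(/\d\z/)
end

%w[x str2 customer_name].each do |name|
  verdict = uncommunicative_name?(name) ? "smelly" : "ok"
  puts "#{name}: #{verdict}"
end
```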

It detects FeatureEnvy, for example, by scanning through a method body and counting the calls to other objects; if there are more of those than there are references to other methods in the same object, then that’s FeatureEnvy. It’s really very naïve. To some extent I wrote it as a way of exploring the code smells that we were writing about.

I thought if I could figure out how to automatically detect them, then we would know we were writing about the right things. It’s continually growing, as well. I’m adding new code smell detectors every month. I have a backlog of about a dozen that need to be added, but it will never be a substitute for genuine understanding by the developer.
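The FeatureEnvy counting rule described above might look roughly like the sketch below. It assumes the method body has already been reduced to a list of `[receiver, method_name]` pairs, with `:self` standing for the method's own object; that input shape and the name `envied_receivers` are assumptions made for illustration, not how Reek represents calls.

```ruby
# Hypothetical sketch of the counting heuristic; not Reek's implementation.
# calls: array of [receiver, method_name] pairs found in one method body,
# where the receiver :self means a call to a method on the same object.
def envied_receivers(calls)
  counts = calls.group_by(&:first).transform_values(&:size)
  own_references = counts.fetch(:self, 0)

  counts
    .reject { |receiver, _| receiver == :self }
    .select { |_, count| count > own_references }
    .keys
end

calls = [
  [:order, :price],
  [:order, :quantity],
  [:order, :discount],
  [:self, :format_money]
]
p envied_receivers(calls)
```

Here the method talks to `order` three times and to itself once, so `order` is reported as the envied object.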

   

10. At the technical level how does Reek work? Does it use ParseTree or ruby_parser?

It uses ruby_parser. The original version used ParseTree, but ParseTree was EOL'ed last year, so it uses ruby_parser to take source code and produce the syntax tree. Then it runs through the syntax tree and decorates it by sticking modules into most of the nodes, which allows the nodes to have behavior.

It runs over the syntax tree looking for interesting nodes, mostly class, module and method definitions. For each one it finds, it fires the appropriate smell detectors, and they look inside the smaller bits of syntax tree. If they see a problem, they fire that back up to one of the views, which are basically textual reporting views.

It writes a textual report, either just plain text saying "You got FeatureEnvy on line such and such", or YAML, which is a much more detailed breakdown of the stuff.
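The "decorating" step can be pictured with plain nested arrays standing in for ruby_parser's s-expressions. Everything in this sketch is an assumption made for illustration: the module names, the `decorate` function, and the simplified node layout (node type first, then name, then children) are not taken from Reek's source.

```ruby
# Hypothetical decorator modules; each adds behavior to one kind of node.
module DefnNode
  def name
    self[1]
  end

  def parameter_names
    self[2].drop(1) # skip the leading :args marker
  end
end

module ClassNode
  def name
    self[1]
  end
end

DECORATORS = { defn: DefnNode, class: ClassNode }.freeze

# Walk the tree and mix the matching module into each node object.
def decorate(node)
  return node unless node.is_a?(Array)

  decorator = DECORATORS[node.first]
  node.extend(decorator) if decorator
  node.each { |child| decorate(child) }
  node
end

# Simplified stand-in for a parsed class with one method.
tree = [:class, :Account, nil,
        [:defn, :deposit, [:args, :amount], [:nil]]]
decorate(tree)

puts tree.name
puts tree[3].name
p tree[3].parameter_names
```

Because `extend` adds the module to individual objects, each node keeps its array structure while gaining methods that suit its type.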

   

11. Is Reek extensible?

Only by me at the moment, because there are some dependencies between the various bits of it that I’m working to get rid of. In the next few months I hope to release a version that allows you to plug in your own smell detectors. That’s what I’m working towards.

   

12. How do you look at the syntax tree? Do you look for the individual nodes and look at them? Do you use some kind of pattern matching?

It’s just a simple tree walk. I run through the tree once decorating it and then run through it again. It’s a simple depth-first, left-to-right traversal, and for every class or module or defn I fire out an event to the smell detectors that are listening for that.

Each smell detector is passed the sub-tree of interest and they just look inside that for the things they are interested in. One reason it’s hard to write smell detectors at the moment is that the API isn’t quite rich enough yet to make it easy. You still have to know quite a lot about syntax trees to make that work.
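A depth-first, left-to-right walk that fires events to listening detectors could be sketched as follows. The `TreeWalker` class, its `on` and `walk` methods, and the array-based tree are hypothetical stand-ins chosen for illustration; they are not Reek's actual classes.

```ruby
# Hypothetical sketch of an event-firing tree walk; not Reek's code.
class TreeWalker
  INTERESTING = %i[class module defn].freeze

  def initialize
    @listeners = Hash.new { |hash, type| hash[type] = [] }
  end

  # Register a detector block for one node type.
  def on(type, &detector)
    @listeners[type] << detector
    self
  end

  # Visit a node first, then its children from left to right.
  def walk(node)
    return unless node.is_a?(Array)

    if INTERESTING.include?(node.first)
      @listeners[node.first].each { |detector| detector.call(node) }
    end
    node.each { |child| walk(child) }
  end
end

tree = [:class, :Account, nil,
        [:defn, :deposit, [:args, :amount], [:nil]],
        [:defn, :withdraw, [:args, :amount], [:nil]]]

walker = TreeWalker.new
walker.on(:class) { |node| puts "class #{node[1]}" }
walker.on(:defn)  { |node| puts "method #{node[1]}" }
walker.walk(tree)
```

Each detector receives only the sub-tree it registered for, so it can inspect that fragment without knowing about the rest of the file.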

   

13. What’s the most complex smell detector in Reek?

The one I’m working on at the moment is for data clump. A data clump is when you look through the definition of a class or a module and you see the same several parameters being passed to several of the methods. That is an indication that maybe those parameters ought to actually be collected together and encapsulated as an object.

The more copies there are of the group of parameters, the more likely it is there is a missing class somewhere. Right now, the detector for that is really inefficient and on some code bases it runs out of memory. There is a fantastic project called Caliper which is run by the guys at devver.net.

Caliper looks at, I think, every project in every public repository on GitHub, and every time a commit is made Caliper runs Reek and all the other metrics. If you go to the Caliper website, you can see up-to-the-minute statistics on every single public Git repo, and they found that the data clump detector crashes on one or two of those.

I think there are up to 6,000 repos. For the most part it works well, but it runs out of memory in a few special cases, so that’s what I’m working on at the moment. It’s very complex to work out because the parameters don’t have types in Ruby, so I’m just using the names and trying to intuit whether it is a data clump or not. [Editor's note: the Caliper service has been shut down].
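The name-based data clump idea described above can be sketched as below. It assumes each method has already been reduced to a list of parameter names; the `data_clumps` function, its `min_size` and `min_copies` thresholds, and the sample method names are all hypothetical, and this is not Reek's detector.

```ruby
# Hypothetical sketch of name-based data clump detection; not Reek's code.
# methods: hash of method name => array of parameter names.
# Returns each group of parameter names that appears in at least
# min_copies methods, mapped to the methods that share it.
def data_clumps(methods, min_size: 2, min_copies: 2)
  occurrences = Hash.new { |hash, clump| hash[clump] = [] }

  methods.each do |method_name, params|
    (min_size..params.size).each do |size|
      # Sorting makes [:x, :y] and [:y, :x] count as the same clump.
      params.sort.combination(size) { |clump| occurrences[clump] << method_name }
    end
  end

  occurrences.select { |_, names| names.size >= min_copies }
end

methods = {
  draw_line: %i[x y color],
  move_to:   %i[x y],
  fill:      %i[color]
}
p data_clumps(methods)
```

This sketch enumerates every parameter subset, so its cost grows combinatorially with the length of the parameter lists; it is meant only to show the idea of matching clumps by name.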

   

14. Does the output of Reek include any kind of source information or source location so that potentially a refactoring tool could modify, fix the simple code smells?

The YAML output includes the source file name and the line numbers that contribute to the smell, and usually that’s more than one; for example, if you’ve got a bad name, it’s the line number of every place where it’s used. It also includes the fully qualified name of the enclosing scope, whether that is a method or a class or whatever, plus all the details of the actual smell itself.
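A report entry carrying that kind of information might be built as in the sketch below, using Ruby's standard `yaml` library. The field names, the `smell_record` helper and the sample values are assumptions for illustration only; they are not Reek's actual YAML schema.

```ruby
require "yaml"

# Hypothetical record layout; the keys are illustrative, not Reek's schema.
def smell_record(smell:, source:, lines:, context:, message:)
  {
    "smell"   => smell,
    "source"  => source,
    "lines"   => lines,
    "context" => context,
    "message" => message
  }
end

report = [
  smell_record(
    smell: "UncommunicativeName",
    source: "lib/account.rb",
    lines: [12, 15, 19],
    context: "Account#deposit",
    message: "has the variable name 'x'"
  )
]

puts report.to_yaml
```

A downstream tool could load such a document and use the file name and line numbers to mark up the source.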

That was explicitly asked for by the Caliper guys and by Martin Andrews who’s developing a thing called Codeyak, which uses Reek to display code metrics. Both of those projects wanted line numbers and contextual information. Eventually I think they will be displaying marked up source code.

   

15. Reek could potentially also run inside an IDE like Eclipse or NetBeans or IntelliJ?

Originally I had a thought of doing that. It was one of the things on my original to-do list, but I never got round to it and now these guys are doing web tools that effectively do the same. So, thankfully they’ve taken that job away from me.

   

16. If there are others interested in this tool, where can they find it?

The tool is on GitHub. If you go to my account, kevinrutherford, on GitHub (http://github.com/kevinrutherford), you’ll find the repo there. There is also a website for the book, http://www.refactoringinruby.info/, and there are links to the tool from there.

   

17. Thank you Kevin Rutherford.

Thank you.

Aug 25, 2010
