Understanding Legacy Code with Characterization Testing

Alberto Savoia has written a series of four articles describing characterization tests, which are essentially unit tests added after the fact to help describe and document legacy code. The name "characterization testing" was coined by Michael Feathers in his book "Working Effectively with Legacy Code", as described in Alberto's first article:
Michael Feathers defines characterization tests as tests that characterize the actual behavior of a piece of code. In other words, they don’t check what the code is supposed to do, as specification tests do, but what the code actually and currently does... Having a set of characterization tests helps developers working with legacy code because they can run those tests after modifying their code and make sure that their modification did not cause any unintended or unwanted changes in functionality somewhere else.
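
To make the distinction concrete, here is a minimal sketch of a characterization test in Python. The function `legacy_format_price` and its quirks are invented for illustration and do not come from Alberto's articles. The point is that the assertions record what the code returns today, even where that looks wrong.

```python
import unittest


def legacy_format_price(cents):
    # Hypothetical legacy function, invented for this sketch.
    # It has two quirks: cents are not zero-padded, and the sign is dropped.
    dollars = abs(cents) // 100
    remainder = abs(cents) % 100
    return "$" + str(dollars) + "." + str(remainder)


class CharacterizeFormatPrice(unittest.TestCase):
    """Pins the current behavior; it does not claim the behavior is correct."""

    def test_whole_dollars_are_not_zero_padded(self):
        # A specification test would probably expect "$5.00".
        self.assertEqual(legacy_format_price(500), "$5.0")

    def test_single_digit_cents_are_not_zero_padded(self):
        self.assertEqual(legacy_format_price(1205), "$12.5")

    def test_negative_amount_loses_its_sign(self):
        # Surprising, but this is what the code actually does.
        self.assertEqual(legacy_format_price(-250), "$2.50")
```

Saved to a file, the tests can be run with `python -m unittest`. If a later change makes any of them fail, that is a signal that existing behavior changed, which may or may not have been intended.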

But where does a developer begin - and end - writing characterization tests?

Michael Feathers provides the following heuristics for writing characterization tests:
  1. Write tests for the area where you will make your changes. Write as many test cases as you feel you need to understand the behavior of the code.
  2. After doing this, take a look at the specific things you are going to change, and attempt to write tests for those.
  3. If you are attempting to extract or move some functionality, write tests that verify the existence and connection of those behaviors on a case-by-case basis. Verify that you are exercising the code that you are going to move and that it is connected properly. Exercise conversions.
In his second article, Alberto serves up a piece of confusing legacy code, and shows how he uses characterization tests to unravel its meaning:
It’s hard to fully understand what’s actually going to happen by just looking at the code; you’d have to simulate the various paths of the code’s execution in your head, keep track of variable values, etc. – tough things to do for anything but the most trivial code. Fortunately, as you will see, with characterization testing we don’t have to do that. We don’t look at the code to gain complete understanding of what it does; we look at the code for clues and suggestions on what to test.
In the third article, Alberto uses his sample characterization tests to fix a bug and to refactor the code so that it is easier to understand. Alberto also notes that code coverage tools can be especially helpful with characterization testing, because they reveal whether the characterization tests do in fact "characterize" the system: if the tests miss paths through the legacy code, then either that code is dead or the tests do not yet describe all of the existing behavior.
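
The role of coverage can be sketched with the Python standard library alone. This is a toy line-coverage check built on `sys.settrace`, not the tooling used in the articles, and `legacy_shipping_fee` is a hypothetical function invented for the example. Real projects would normally use a dedicated coverage tool.

```python
import dis
import sys


def legacy_shipping_fee(weight_kg, express):
    # Hypothetical legacy function, invented for this sketch.
    if weight_kg <= 0:
        return 0
    fee = 5 + 2 * weight_kg
    if express:
        fee = fee * 2
    return fee


def executed_offsets(func, calls):
    """Call func with each argument tuple; return executed line offsets.

    Offsets are counted from the function's `def` line.
    """
    code = func.__code__
    seen = set()

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # do not trace other functions
        if event == "line":
            seen.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        for args in calls:
            func(*args)
    finally:
        sys.settrace(previous)
    return seen


def uncovered_offsets(func, calls):
    """Return offsets of lines in func that none of the calls executed."""
    code = func.__code__
    first = code.co_firstlineno
    all_offsets = {
        lineno - first
        for _, lineno in dis.findlinestarts(code)
        if lineno is not None and lineno > first  # skip the def line itself
    }
    return all_offsets - executed_offsets(func, calls)


# Inputs used by the characterization tests written so far (an assumption).
tests_so_far = [(1, False), (3, False)]
print(sorted(uncovered_offsets(legacy_shipping_fee, tests_so_far)))
```

With these inputs, the uncovered lines are the zero-weight guard's `return 0` and the express surcharge. Neither behavior has been characterized yet, so each one needs a new test or a decision that the code is dead.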

It may seem that writing a base set of characterization tests could be done automatically, and Alberto has created a tool which does just that - JUnit Factory, an online characterization test generator (Eclipse plug-in available). In his fourth article, Alberto uses JUnit Factory to generate characterization tests for the example code, and then compares them with the tests he wrote himself. The conclusion? Automatic generation of characterization tests provides developers with an excellent starting point for understanding and working with legacy code.
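
The basic idea of generating such tests can be illustrated with a small sketch: run the code on sample inputs, record what comes back, and emit assertions that pin those results. This is not how JUnit Factory works internally, which the article does not describe. The names `generate_characterization_tests` and `legacy_discount` are hypothetical, and the approach assumes that arguments and results have a `repr` that is valid Python.

```python
def legacy_discount(total, tier):
    # Hypothetical legacy function, invented for this sketch.
    if tier.upper() == "GOLD":
        return total * 0.8
    if total > 1000:
        return total - 50
    return total


def generate_characterization_tests(func, sample_inputs):
    """Return Python source for tests that pin func's current behavior."""
    name = func.__name__
    out = []
    for i, args in enumerate(sample_inputs):
        arg_text = ", ".join(repr(a) for a in args)
        call = f"{name}({arg_text})"
        out.append(f"def test_{name}_{i}():")
        try:
            observed = func(*args)
        except Exception as exc:
            # The current behavior is to raise, so pin the exception type.
            error = type(exc).__name__
            out.append("    try:")
            out.append(f"        {call}")
            out.append(f"    except {error}:")
            out.append("        pass")
            out.append("    else:")
            out.append(f"        raise AssertionError('expected {error}')")
        else:
            # The current behavior is to return a value, so pin it.
            out.append(f"    assert {call} == {observed!r}")
        out.append("")
    return "\n".join(out)


samples = [(100, "gold"), (2000, "basic"), (100, None)]
print(generate_characterization_tests(legacy_discount, samples))
```

The generated tests are only as good as the sample inputs. A real generator must also choose inputs that reach every path, which is where the coverage feedback described above comes in.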

Community comments

  • Sorry to disappoint you but the series is not over yet :-)

    by Alberto Savoia ,

    Hi Kurt,

    Thanks a bunch for covering and summarizing my blogs on characterization testing.

    I am writing this because you've got everything right except for the fact that the fourth blog is not the final one and I wanted to set the record straight. I plan to write at least a handful more since characterization testing is an under-covered technique that hasn't received the attention it deserves. And it needs a lot of attention given the rapidly growing amount of legacy code out there.

    Thanks again for the coverage.


  • Re: Sorry to disappoint you but the series is not over yet :-)

    by Kurt Christensen,

    Ha ha... That's what I get for not contacting you before publishing! Thanks for the articles; they provide a great overview of characterization testing. Please continue showing examples - seeing the creation of characterization tests, in conjunction with code coverage tools, really helped clarify (for me, anyway) the distinction between unit tests written for new code vs. unit tests written to work with legacy code.
