Understanding Legacy Code with Characterization Testing
Michael Feathers defines characterization tests as tests that characterize the actual behavior of a piece of code. In other words, they don’t check what the code is supposed to do, as specification tests do, but what the code actually and currently does... Having a set of characterization tests helps developers working with legacy code because they can run those tests after modifying their code and make sure that their modification did not cause any unintended or unwanted changes in functionality somewhere else.
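To make the definition concrete, here is a minimal sketch of what a characterization test can look like. Everything in it is hypothetical: `shipping_cost` stands in for a piece of undocumented legacy code, and Python's `unittest` is used only for brevity (the articles discussed here center on JUnit tooling). The point to notice is that the expected values were obtained by running the code, not by reading a specification.

```python
import unittest


def shipping_cost(weight_kg):
    """Hypothetical legacy function; its rules stand in for undocumented behavior."""
    if weight_kg <= 0:
        return 0
    if weight_kg < 5:
        return 4.5
    return 4.5 + (weight_kg - 5) * 1.25


class ShippingCostCharacterization(unittest.TestCase):
    # Each expected value was copied from what the code returned when run.
    # The tests record current behavior; they do not claim it is correct.

    def test_negative_weight_currently_costs_nothing(self):
        self.assertEqual(shipping_cost(-3), 0)

    def test_light_parcel_currently_has_flat_rate(self):
        self.assertEqual(shipping_cost(2), 4.5)

    def test_heavy_parcel_currently_adds_per_kg_charge(self):
        self.assertEqual(shipping_cost(9), 9.5)
```

Saved in a file, these tests can be run with `python -m unittest`. If a later change makes one of them fail, that signals a change in behavior, which the developer then has to judge as intended or not.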
But where does a developer begin - and end - writing characterization tests?
Michael Feathers provides the following heuristics for writing characterization tests:
- Write tests for the area where you will make your changes. Write as many test cases as you feel you need to understand the behavior of the code.
- After doing this, take a look at the specific things you are going to change, and attempt to write tests for those.
- If you are attempting to extract or move some functionality, write tests that verify the existence and connection of those behaviors on a case-by-case basis. Verify that you are exercising the code that you are going to move and that it is connected properly. Exercise conversions.
In his second article, Alberto serves up a piece of confusing legacy code, and shows how he uses characterization tests to unravel its meaning:
It’s hard to fully understand what’s actually going to happen by just looking at the code; you’d have to simulate the various paths of the code’s execution in your head, keep track of variable values, etc. – tough things to do for anything but the most trivial code. Fortunately, as you will see, with characterization testing we don’t have to do that. We don’t look at the code to gain complete understanding of what it does; we look at the code for clues and suggestions on what to test.
In the third article, Alberto uses his sample characterization tests to fix a bug and to refactor the code so that it is easier to understand. He also points out that code coverage tools can be especially helpful with characterization testing, because they reveal whether the characterization tests do in fact "characterize" the system: if the tests miss paths through the legacy code, then either that code is dead or the tests aren't truly describing all of the existing behavior.
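The idea of reading the code for clues rather than for full understanding can be sketched as follows. This is an illustration under stated assumptions, not Alberto's method or tooling: `legacy_discount` and `characterize` are hypothetical names, and Python is used only for brevity. The branches visible in the code suggest which inputs to try, and the helper records whatever actually happens, including exceptions.

```python
def legacy_discount(order_total, customer_type):
    """Hypothetical legacy code with several branches worth probing."""
    if customer_type == "gold":
        rate = 0.2
    elif customer_type == "silver":
        rate = 0.1
    else:
        rate = 0
    if order_total > 1000:
        rate += 0.05
    return round(order_total * (1 - rate), 2)


def characterize(func, cases):
    """Run func on each argument tuple and record what actually happens."""
    observed = {}
    for args in cases:
        try:
            observed[args] = func(*args)
        except Exception as exc:
            # A raised exception is part of the current behavior too.
            observed[args] = type(exc).__name__
    return observed


# One input per branch spotted in the code, plus a suspicious one (None).
cases = [
    (100, "gold"),
    (100, "silver"),
    (100, "bronze"),
    (2000, "gold"),
    (None, "gold"),
]
snapshot = characterize(legacy_discount, cases)
for args, result in snapshot.items():
    print(args, "->", result)
```

The recorded values can then be copied into assertions. Running a coverage tool over such a run would show whether any branch was left unexercised, which is the check described above.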
It may seem that writing a base set of characterization tests could be done automatically, and Alberto has created a tool which does just that - JUnit Factory, an online characterization test generator (Eclipse plug-in available). In his fourth and final article, Alberto uses JUnit Factory to generate characterization tests for the example code, and then compares them with the tests he wrote himself. The conclusion? Automatic generation of characterization tests provides developers with an excellent starting point for understanding and working with legacy code.
Sorry to disappoint you but the series is not over yet :-)
Thanks a bunch for covering and summarizing my blogs on characterization testing.
I am writing this because you've got everything right except for the fact that the fourth blog is not the final one and I wanted to set the record straight. I plan to write at least a handful more since characterization testing is an under-covered technique that hasn't received the attention it deserves. And it needs a lot of attention given the rapidly growing amount of legacy code out there.
Thanks again for the coverage.