Test Driven Development and the Trouble with Legacy Code

Complex Maze Allan Baljeu was trying to TDD with a legacy C++ code base, he was running into trouble because:

we end up with classes that don't fully implement the functionality that's eventually needed, and when others come around to use those classes, and eventually fuller implementations are required, then it turns out that the original design is not adequate, a new design is required, some expectations (tests) need to change and previous uses of the class need to be updated.

He wondered if Big Design Up Front would help solve the problem. George Dinwiddie, Agile Coach, suggested that Alan’s design was trying to tell him something. You have to pay attention to the fundamentals of clean code. You can look at basic coupling and cohesion (i.e. SOLID).

Mike “Geepaw” Hill, Agile Coach, says that in his years of coaching agile teams, one of the following has been at the root of these problems:

team is not yet up to speed on refactoring, so your classes aren't really
minimal
team is not yet skilled at simplicity, so ditto
team is not yet doing aggressive & rapid microtesting (aka unit testing), so changes break tests too often
team doesn't know how to handle cross-team or company-to-public dependencies, e.g. shipping api's
team neither pairing nor open workspacing, dramatically slowing team-wide understanding.
team likely has no jiggle-less build
team could be using tools from the '40s

Keith Ray, XP Coach, suggests that with legacy code (i.e. systems with high technical debt) the cost of repaying technical debt dominates the cost of implementing a story. He goes on to offer an approach:

To make the code more well-factored (paying down the technical debt), whenever you need to integrate a new feature into it, you should pay close attention to code smells in both the new code and the old code and consider refactoring to deal with each smell as you recognize it.

You can do refactorings in small safe steps (even in C++) manually. Very closely follow the instructions in Fowler's book on Refactoring until you learn them by heart. Eclipse with gcc has a few refactorings that actually work: Extract Method and Rename. Rename understands scope, so it is safer than search-and-replace. Extract Method and the other refactorings in Ecipse might be buggy, so be careful when you use them. For things like changing a function signature, "lean on the compiler" to show where changes have to be made.

You also need tests to make sure the refactorings are not damaging the existing features. Feather's book on working with legacy code has lots of techniques for adding tests to legacy code. On a higher level, code smells are violations of good design principles. For example, the Single Responsibility Principle (SRP) says there should one purpose for every class / method / module. There are principles about coupling and cohesion and managing dependencies, etc. It's often easier to detect a code smell than it is to apply these abstract principles. "Large Class" and "Large Method" are remedied by "Extract Class" and "Extract Method/Move Method", though knowing SRP helps in deciding what parts of a class or method should be extracted.

Perhaps the most important design principle is "Tell, don't ask": keep functionality and data together.... bad code often has the functionality in one place, and gets the data it needs from other places, creating problems with dependencies and lack of locality -- symptomized by "adding a new feature requires changing lots of code". The code smells "Shotgun Surgery", "Feature Envy", "Long Parameter List" are applicable here.

Getting fast feedback will allow more refactoring, which will (eventually) allow faster development of new features. Try to get parallel builds happening (distributed compilation). Try to get smaller source files and smaller header files. Reduce the complexity of header files - use forward declarations, avoid inline code, try to keep only one class per header file / source file. Using the "pimpl" idiom widely can decrease compile time by 10%, but it can also disguise the "Large Class" and "Feature Envy" code smells.

The advantage of refactoring instead of rewriting, is that you always have working code. If your manual and automated tests are good, then you should be able to ship the code, even if it is a half-way state between a bad design and a good design.

Keith also wrote “Refactoring: Small Steps Guaranteed to Help You Clean Up Your Code” an article on refactoring C++ code in Better Software Magazine.

Previously on InfoQ: Dealing with Legacy Code, Uncle Bob On The Applicability Of TDD and Making TDD Stick: Problems and Solutions for Adopters

InfoQ Software Architects' Newsletter

Write for InfoQ

Rate this Article

This content is in the Architecture topic

Related Topics:

Related Editorial

Related Sponsors

Popular across InfoQ

The InfoQ Newsletter