TDD Opinion: Quality Is a Function of Thought and Reflection, Not Bug Prevention

In a recent post, Michael Feathers, a trainer, mentor and consultant, argues against the widely held idea that unit testing, by itself, improves code quality. Michael talks about unit testing, integration tests, TDD and Clean Room Software Development, concluding that code quality is a function of thought and reflection, not bug prevention. Steve Freeman, an independent consultant for M3P, develops Michael's idea further, talking about the "cognitive justification for TDD" and explaining why TDD is helpful.

Michael considers flawed the idea that quality improves simply because tests catch bugs:

One very common theory about unit testing is that quality comes from removing the errors that your tests catch. Superficially, this makes sense. Tests can pass or fail and when they fail we learn that we have a problem and we correct it. If you subscribe to this theory, you expect to find fewer integration errors when you do integration testing and fewer “unit” errors when you do unit testing. It’s a nice theory, but it’s wrong. The best way to see this is to compare unit testing to another way of improving quality – one that has a very dramatic measurable effect.

To support his argument, Michael talks about Clean Room Software Development, a development method used during the 1980s. According to Michael, Clean Room involved no unit testing at all:

The notion behind Clean Room was that you could increase quality by increasing the rigor of development. In Clean Room, you had to write a logical predicate for every little piece of your code and you had to demonstrate, during a review, that your code did no more or less than the predicate described. It was a very serious approach, and it was a bit more radical than what I just described: another tenet of Clean Room was that there was to be no unit testing. None. Zilch. When you wrote your code it was assumed correct after it was reviewed. The only testing that was done was stochastic testing at the functional level.
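To make the idea of "a logical predicate for every little piece of your code" more concrete, here is a minimal sketch in Python. It is not taken from Michael's post or from any Clean Room material: the function `max_index` and the predicate `max_index_spec` are hypothetical names chosen for illustration. In Clean Room the predicate would be argued in a review rather than executed; it is written as runnable code here only so the relationship between the two is easy to see.

```python
def max_index_spec(xs, i):
    """Predicate (hypothetical): i is the first index of a largest element of xs."""
    in_range = 0 <= i < len(xs)
    # No element is larger than xs[i] ...
    is_largest = in_range and all(xs[i] >= x for x in xs)
    # ... and every element before position i is strictly smaller.
    is_first = in_range and all(xs[j] < xs[i] for j in range(i))
    return in_range and is_largest and is_first


def max_index(xs):
    """Code under review (hypothetical). Precondition: xs is non-empty."""
    best = 0
    for k in range(1, len(xs)):
        # Strict comparison keeps the earliest index when values tie.
        if xs[k] > xs[best]:
            best = k
    return best
```

A reviewer would have to show that, for every non-empty `xs`, `max_index_spec(xs, max_index(xs))` holds, and that the code does nothing beyond what the predicate describes.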

What was interesting was that Clean Room produced results, raising the quality of the code without any unit testing. What happened with Clean Room and what happens with TDD is similar: developers are forced to constantly review, refactor and improve their code. Michael's conclusion is:

My point is that we can't look at testing mechanistically. Unit testing does not improve quality just by catching errors at the unit level. And, integration testing does not improve quality just by catching errors at the integration level. The truth is more subtle than that. Quality is a function of thought and reflection - precise thought and reflection. That’s the magic. Techniques which reinforce that discipline invariably increase quality.

Starting from Michael's post, Steve Freeman develops the idea further, talking about the "cognitive justification for TDD". He begins with how developers actually make the decisions that shape their code:

It turns out that people don’t actually spend their time carefully working out the trade-offs and then picking the best option. Instead, we employ a “first-fit” approach: work through an ordered list of learned responses and pick the first one that looks good enough. All of this happens subconsciously, then our slower rational brain catches up and justifies the existing decision—we can’t even tell it’s happening.

The idea is to stop using the first solution that comes to mind and start evaluating alternatives, and that is why TDD is helpful, according to Steve:

Test-Driven Development works (or should do) by breaking our first-fit pattern matching. It stops us being expert and steam-rolling over the problem with, literally, the first thing that came into our minds. It forces us out of our comfort zone long enough to consider the real requirements we should be addressing. Even better, starting with a test forces us to think first about the need (what’s the test for that?), and then about a solution that our expert mind is so keen to provide.
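As an illustration that does not come from Steve's post, the following minimal sketch shows what "starting with a test" can look like in Python's standard `unittest` module. The requirement (turning strings such as "1h30m" into minutes) and the name `parse_minutes` are assumptions made up for this example. The tests are written first and state the need; the implementation below them is just one solution that satisfies it.

```python
import re
import unittest


class ParseMinutesTest(unittest.TestCase):
    # Written first: each test states a need before any solution exists.
    def test_hours_and_minutes(self):
        self.assertEqual(parse_minutes("1h30m"), 90)

    def test_minutes_only(self):
        self.assertEqual(parse_minutes("45m"), 45)

    def test_rejects_unparseable_text(self):
        with self.assertRaises(ValueError):
            parse_minutes("soon")


def parse_minutes(text):
    """Hypothetical function: convert text like '1h30m' into whole minutes."""
    match = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?", text)
    # The pattern also matches the empty string, so reject that explicitly.
    if not match or not text:
        raise ValueError(f"cannot parse duration: {text!r}")
    hours, minutes = match.groups()
    return int(hours or 0) * 60 + int(minutes or 0)
```

Saved in a file, the tests can be run with `python -m unittest`. Writing the third test before the implementation is the kind of step that raises a question (what should happen with bad input?) that a first-fit solution might never have asked.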

Steve cites an example in support of his point:

The best supporting evidence is Arlo Belshee’s group that implemented Promiscuous Pairing. They found empirically that they were most productive when switching pairs every couple of hours, contrary to what anyone would expect; their view was that they were taking advantage of constantly being in a state of “Beginner’s Mind”.
