Testing Misconceptions

The opinions expressed in this article are those of Liam O'Connor and are not necessarily those of his employer (NICTA).

If you're anything like me, you've probably been exposed to an enormous number of articles advocating Test-Driven Development (TDD) or other development practices involving extensive testing, at both the unit and integration test level. I believe that many advocates of these practices lack the experience in real-world projects to make their arguments credible. In fact, these extremely rigorous testing practices often don't work at all when scaled to larger projects.

In this article, I'll explain some of the common misconceptions about testing. If you write your tests with these in mind, I hope that it will help you and your team to decide when it is appropriate to test, and when it isn't.

Misconception 1: Tests show my code is correct!

While this misconception appears true intuitively, you cannot actually rely on tests to establish any form of rigorous correctness standard. When you write a test, you have tested one possible scenario in your program. Many units in your program may have an infinite (or intractably large) number of possible scenarios to test. It is not feasible to test them all - so the typical response is to test some failure cases, edge cases and maybe a couple of "regular" cases just to make sure everything is all right.

This is barely a drop in the ocean if your goal is correctness. It's fairly easy to develop a suite of tests that always passes, despite the presence of bugs. There are some bugs that are essentially impossible to detect via tests - race conditions and other errors involving concurrency are classic examples where, even if you had control over the scheduler, the number of possible interleavings grows so rapidly that it quickly becomes impossible to test reliably.
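As a minimal sketch of a suite that passes despite a bug, consider the following. The function `is_leap_year` and its tests are hypothetical, invented purely for illustration; the suite covers a regular case, a failure case and an edge case, yet never exercises the input that exposes the defect.

```python
def is_leap_year(year: int) -> bool:
    # Bug: century years must also be divisible by 400, which is ignored here.
    return year % 4 == 0


def run_suite() -> bool:
    # A plausible-looking suite: every assertion below holds.
    assert is_leap_year(2024) is True    # regular leap year
    assert is_leap_year(2023) is False   # regular non-leap year
    assert is_leap_year(2000) is True    # "edge case": century divisible by 400
    return True


if __name__ == "__main__":
    print("suite passed:", run_suite())
    # 1900 was not a leap year, but the buggy function says it was.
    print("is_leap_year(1900) =", is_leap_year(1900))
```

The suite is green, and the function is still wrong for 1900, 2100 and every other century year not divisible by 400.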

So, tests show correctness only in the most trivial of cases, where the test can fully specify the behavior of the unit. In these trivial cases, it is often not worth writing the tests in the first place; these cases are trivial precisely because the code they are testing is trivial! The only thing that is achieved by writing tests for trivial pieces of code is increased maintenance overhead and added workload for testing machines.

Seeing as tests are just code, you can also have bugs in your tests. If the person writing the test is the same person writing the code, often they may implement a unit incorrectly, and then write a test that ensures that incorrect behavior. The root of the problem is the developer misunderstanding the specification, not a minor mistake in implementation.
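A small sketch of how this can happen, assuming a made-up specification ("orders of 100 or more get a 10% discount") and a hypothetical function `discounted_total`. The developer misreads "or more" as "more than", and the test faithfully records the same misreading.

```python
# Hypothetical specification: "orders of 100 or more get a 10% discount".
# Amounts are whole currency units so the arithmetic stays in integers.

def discounted_total(total: int) -> int:
    # The developer misread the spec as "more than 100".
    if total > 100:
        return total * 9 // 10
    return total


def test_discount_boundary() -> None:
    # The same misreading, now locked in by a test. Both assertions pass,
    # and the first one would start failing if someone fixed the
    # implementation to match the specification.
    assert discounted_total(100) == 100
    assert discounted_total(200) == 180


if __name__ == "__main__":
    test_discount_boundary()
    print("test passed; an order of exactly 100 still gets no discount")
```

Here the test does more than miss the bug: it actively defends it against a later correction.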

If you really need correctness, then formally verify your code (the tools for verification are much better these days than in the past). If you don't need correctness, write tests. Always keep in mind that the tests you are writing merely serve as a smoke alarm: they can warn you of a fire, but they can't detect a whole variety of other problems.

Misconception 2: Tests are executable specifications!

This is false for a few reasons. Let's look at my dictionary's definition of specifications:

A set of requirements defining an exact description of an object or a process.

Therefore, if my code conforms to specifications, it should be completely correct, as the specification exactly defines the behavior of the code. If my tests are specifications, they must therefore establish correctness. As we've already discussed, they do no such thing, hence they're not specifications.

Somewhat less pedantically, if we assume that a developer could infer from reading test cases the desired behavior of a function, then we introduce a whole lot of imprecision; if test cases are not extensive enough, we could end up inferring the wrong thing, sometimes only subtly different from the desired behavior.
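To make the imprecision concrete, here is a rough sketch with two hypothetical implementations of a text-truncation function. Both satisfy the same example-based "specification", yet they disagree on an input the examples never mention. All names are invented for this illustration.

```python
def truncate_chars(text: str, limit: int) -> str:
    # Reading 1: cut after `limit` characters, even in the middle of a word.
    return text[:limit]


def truncate_words(text: str, limit: int) -> str:
    # Reading 2: keep only whole words that fit within `limit` characters.
    if len(text) <= limit:
        return text
    kept = []
    for word in text.split(" "):
        if len(" ".join(kept + [word])) > limit:
            break
        kept.append(word)
    return " ".join(kept)


# The "specification by example" a reader would have to infer from.
EXAMPLES = [
    (("hello", 10), "hello"),
    (("hello world", 5), "hello"),
]


def passes_examples(fn) -> bool:
    return all(fn(*args) == expected for args, expected in EXAMPLES)


if __name__ == "__main__":
    print(passes_examples(truncate_chars), passes_examples(truncate_words))
    # The two readings diverge on an input the examples do not cover.
    print(repr(truncate_chars("hello world", 8)))
    print(repr(truncate_words("hello world", 8)))
```

Nothing in the examples says which reading was intended, so a developer inferring the behavior from them has an even chance of picking the wrong one.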

In addition, test cases are not checked for consistency. This means that your tests could actually "specify" an undesirable behavior as a result of developer error or misunderstanding. This could lead to contradictions in your tests, and therefore in your specification.

Randomized testing software such as QuickCheck allows you to write tests simply as Boolean properties that should hold, and the test cases are generated for you by the software. This allows tests to come much closer to executable specifications; however, the properties are still not checked for consistency.
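The idea can be sketched in a few lines of Python using only the standard library. This is not QuickCheck itself (which is a Haskell library) and it omits features such as counterexample shrinking; the names `sort_property`, `find_counterexample` and `buggy_sort` are assumptions made up for this illustration.

```python
import random
from collections import Counter
from typing import Callable, List, Optional


def sort_property(sort_fn: Callable[[List[int]], List[int]], xs: List[int]) -> bool:
    # Boolean property: the output is ordered and is a permutation of the input.
    out = sort_fn(list(xs))
    ordered = all(a <= b for a, b in zip(out, out[1:]))
    same_elements = Counter(out) == Counter(xs)
    return ordered and same_elements


def find_counterexample(sort_fn, trials: int = 200, seed=None) -> Optional[List[int]]:
    # Generate random inputs and return the first one that violates the property.
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-10, 10) for _ in range(rng.randint(0, 20))]
        if not sort_property(sort_fn, xs):
            return xs
    return None


def buggy_sort(xs: List[int]) -> List[int]:
    # Bug: silently drops duplicate elements.
    return sorted(set(xs))


if __name__ == "__main__":
    print("sorted:", find_counterexample(sorted))
    print("buggy_sort:", find_counterexample(buggy_sort))
```

The property reads much more like a specification than a handful of examples does, but it is still only sampled on random inputs, and nothing checks that a set of such properties is mutually consistent.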

Misconception 3: Tests lead to good design!

While making a bad design testable does have potential to improve it, testing is not a replacement for good design practices. Writing huge suites of tests against the interfaces of a system increases the "work investment" that developers have put into those interfaces. The problem arises when these interfaces are no longer optimal but developers have already written a huge number of tests against them. Changing the interfaces would mean changing all the tests as well. Tests are tightly coupled to those interfaces, so most of those tests will have to be thrown away and rewritten. Seeing as most developers grow attached to their work, this can lead to suboptimal design decisions hanging around even if they aren't the best fit, well into the lifetime of the project.

The solution here is to start testing only after you've written a series of prototypes. Don't bother testing something you are likely to refactor heavily very soon. All it does is increase workload on developers and testing machines, as well as cause developers more pain when they have to destroy hours of work when requirements or interfaces change. If you start testing too early, your tests can actually lead to bad design, as developers will be reluctant to do any major refactoring.

In addition, making code testable is hard. Often people resort to questionable design decisions just to make testing easier: exposing leaks in abstraction, tying mocks too heavily to implementations of interfaces, or writing test cases that are full of so much code that they almost require tests themselves (mocks and stubs often suffer from this problem).
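As an illustration of a mock tied too heavily to an implementation, the sketch below uses Python's standard `unittest.mock`. The repository interface (`get_price`, `get_prices`) and both versions of `total_price` are hypothetical. The two versions return the same result, but one of the tests also pins down how that result is obtained, so it breaks on a harmless refactoring.

```python
from unittest.mock import Mock

PRICES = {1: 5, 2: 7}


def make_repo() -> Mock:
    # A mock repository that supports both a per-item and a bulk lookup.
    repo = Mock()
    repo.get_price.side_effect = lambda item_id: PRICES[item_id]
    repo.get_prices.side_effect = lambda ids: [PRICES[i] for i in ids]
    return repo


def total_price_v1(repo, ids):
    # Original version: one lookup per item.
    return sum(repo.get_price(i) for i in ids)


def total_price_v2(repo, ids):
    # Refactored version: a single bulk lookup, same observable result.
    return sum(repo.get_prices(ids))


def behaviour_test(total_price) -> None:
    # Checks only what the function returns.
    assert total_price(make_repo(), [1, 2]) == 12


def implementation_coupled_test(total_price) -> None:
    repo = make_repo()
    assert total_price(repo, [1, 2]) == 12
    # Over-specification: insists on *how* the prices were fetched.
    assert repo.get_price.call_count == 2


if __name__ == "__main__":
    behaviour_test(total_price_v1)
    behaviour_test(total_price_v2)
    implementation_coupled_test(total_price_v1)
    try:
        implementation_coupled_test(total_price_v2)
    except AssertionError:
        print("coupled test broke on a behavior-preserving refactoring")
```

The behavior-level test survives the refactoring; the implementation-coupled test fails even though nothing observable has changed.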

Misconception 4: Tests make it easier to change code!

Tests do not always make changing code easier. If you are changing the underlying implementation of an interface, tests can help catch regressions or undesirable behavior in your new implementation. If you are changing the higher-level structure of the program, however, the opposite is generally the case. Tests are often tightly coupled to higher-level interfaces. Changing these interfaces means rewriting the tests. In that case, you've made your life harder - you will have to rewrite the tests, adding more work, and the old tests do nothing to ensure you haven't introduced a regression, meaning the tests haven't helped at all.

So, don't write tests?

I am not saying that you should not write tests. Tests are a valuable way to improve confidence and prevent regressions in software. They do not, however, uniformly lead to good design, correctness, technical specifications, or effortless refactoring for the reasons outlined above. Using tests in excess makes development *harder*, not easier.

Conversely, not verifying code at all makes quality assurance impossible, but rapid prototyping easy. Testing introduces a trade-off between quality assurance and flexibility, so a suitable compromise must be struck.

About the Author

Liam O'Connor formerly worked for Google, and teaches at the University of New South Wales. He recently began working for NICTA, Australia's leading ICT research institution, on the l4.verified project: the formal verification of an operating system kernel.
