Key Takeaways
- The testing diamond didn't address the problems of the testing pyramid. In fact, it avoided the problems caused by misinformation about unit tests.
- Opaque-box tests are not exclusive to testing through public interfaces of the system. A system is composed of many boundaries, and all of them benefit from behaviour-focused tests.
- By avoiding clear-box testing, the need for heavy mocking and public interfaces will drop significantly. This leads to a more maintainable test portfolio.
- Keep publicly accessible code to a minimum. The less code you have that is accessible, the easier it is to maintain, evolve, and refactor your code.
- Build architectures with a testing strategy in mind. How easy it is to test them will dictate the success of the architecture.
It was 2014 when David Heinemeier Hansson set the software development world on fire. He was on a RailsConf stage when he proclaimed that "TDD is dead".
It was a bold move. But he was the leader that many unhappy with testing were looking for. Many followed along, splitting developers into two camps.
That moment was the epicenter of a new wave, one that brought us to today, where unit tests are losing importance in favor of integration tests and the famous Testing Pyramid by Mike Cohn is being reshaped as a diamond.
It's impossible to find a single reason for this movement, but it's easy to find many behind the discontentment with the existing testing practices.
This happens when practices are spread like dogma, lack proper guidance, and are rooted in abstract thinking.
Everyone starts by doing their best. Trying, failing, and trying again. Until the moment that someone breaks the chain and presents a different path. A path to a promised low-maintenance test suite.
What Is the Best Direction?
One thing I've learned in this industry is that, even though it is a young field, we quickly forget our history. The rapid pace makes us believe that the past has no answers and that the future has many great things to unveil. I can't argue with what the future may bring, but I can tell you that our first tendency is to look for innovation instead of information.
Asking the following questions would probably spare many people from feeling the need for a Testing Diamond as a replacement for the Pyramid:
- Is this problem caused by unit tests or by how I write unit tests?
- Am I applying integration testing to components that need it?
- Did I misunderstand anything that led me to the same assertions in multiple places?
- Am I improving my design through tests or testing around the existing design?
Getting Back to Our Roots
Likely the answer is once again hidden in the past.
So, what does history tell us about integration tests? Historically, integration testing was the stage when different development units were tested together. Those units were developed in isolation, often by multiple teams. That was the phase when we guaranteed that the defined interfaces were well implemented and worked accordingly.
Nowadays, we see integration tests applied to code units developed by the same team. This implies that each source code file is a system boundary. As if each code file had been developed by an autonomous team. This is blurring the lines between unit and integration tests.
Based on that, we could reason that the move away from unit tests was rooted in a mistake. The idea that integration tests are for testing between teams and unit tests are for testing within a team is the original distinction, and it still holds. By treating every file as a boundary, we were fixing a problem caused by us.
What we should do instead is define clear boundaries. Not layers, but boundaries between each development team. Those boundaries will give you a perspective on the system's role and how it interacts with other domains. This is similar to how Alistair Cockburn describes the Hexagonal Architecture, also known as Ports and Adapters. In his work, he describes a system as having two sides. The internal and the external ones. Now, we need to bridge those two sides through well-defined boundaries.
How does that help? This internal/external relationship clarifies the relationship between unit and integration tests. Unit tests are responsible for testing the boundary from an outside-in perspective, while integration tests test the boundary from an inside-out perspective. In concrete terms, integration tests ensure the correct behavior of the Adapters, Gateways, and Clients that mediate the relationship with other development units (such as APIs, Plugins, Databases, and Modules).
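As a minimal sketch of that split, assume a hypothetical ordering core with one port, `OrderRepository`, and one SQLite adapter. None of these names come from a real codebase; they exist only for illustration. The unit test drives the core through its boundary with an in-memory stand-in, while the integration test exercises nothing but the adapter against a real (in-memory) SQLite engine.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class Order:
    order_id: str
    total_cents: int


class OrderRepository(Protocol):
    """Port: what the core needs from storage (hypothetical name)."""

    def save(self, order: Order) -> None: ...

    def find(self, order_id: str) -> Optional[Order]: ...


class InMemoryOrderRepository:
    """In-process stand-in for the port, used by unit tests."""

    def __init__(self) -> None:
        self._orders: dict[str, Order] = {}

    def save(self, order: Order) -> None:
        self._orders[order.order_id] = order

    def find(self, order_id: str) -> Optional[Order]:
        return self._orders.get(order_id)


class SqliteOrderRepository:
    """Adapter: the only code that knows about SQL."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._connection = connection
        self._connection.execute(
            "CREATE TABLE IF NOT EXISTS orders "
            "(order_id TEXT PRIMARY KEY, total_cents INTEGER)"
        )

    def save(self, order: Order) -> None:
        self._connection.execute(
            "INSERT OR REPLACE INTO orders VALUES (?, ?)",
            (order.order_id, order.total_cents),
        )

    def find(self, order_id: str) -> Optional[Order]:
        row = self._connection.execute(
            "SELECT order_id, total_cents FROM orders WHERE order_id = ?",
            (order_id,),
        ).fetchone()
        return Order(*row) if row else None


def place_order(
    repository: OrderRepository, order_id: str, prices_cents: list[int]
) -> Order:
    """Core behavior: reject empty orders, otherwise total and store them."""
    if not prices_cents:
        raise ValueError("an order needs at least one item")
    order = Order(order_id, sum(prices_cents))
    repository.save(order)
    return order


def test_unit_place_order_stores_the_total() -> None:
    # Unit test: outside-in through the core's boundary, no database involved.
    repository = InMemoryOrderRepository()
    place_order(repository, "o-1", [250, 750])
    assert repository.find("o-1") == Order("o-1", 1000)


def test_integration_sqlite_adapter_round_trips_an_order() -> None:
    # Integration test: inside-out, only the adapter against a real SQL engine.
    repository = SqliteOrderRepository(sqlite3.connect(":memory:"))
    repository.save(Order("o-2", 400))
    assert repository.find("o-2") == Order("o-2", 400)
    assert repository.find("missing") is None
```

Note that the integration test says nothing about order rules, and the unit test says nothing about SQL. Each assertion lives in one place.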
Behavior Focused Testing
What does the unit in unit tests mean? It means a unit of behavior. There's nothing in that definition dictating that a test has to focus on a single file, object, or function. Why is it difficult to write unit tests focused on behavior?
A common problem with many types of testing comes from a tight connection between software structure and tests. That happens when the developer loses sight of the test goal and approaches it in a clear-box (sometimes referred to as white-box) way.
Clear-box testing means testing with the internal design in mind to guarantee the system works correctly. This is really common in unit tests. The problem with clear-box testing is that tests tend to become too granular, and you end up with a huge number of tests that are hard to maintain due to their tight coupling to the underlying structure.
Part of the unhappiness around unit tests stems from this fact. Integration tests, being more removed from the underlying design, tend to be impacted less by refactoring than unit tests.
I like to look at things differently. Is this a benefit of integration tests or a problem caused by the clear-box testing approach? What if we had approached unit tests in an opaque-box (sometimes referred to as black-box), behavior-driven way? Wouldn't we have reached similar or even better results?
A common misunderstanding is thinking that opaque-box testing can only be applied to the outer boundaries of our system. That is wrong. Our system is built with many boundaries. Some may be accessible through a communication protocol, while others may be extended with in-process adapters. Each adapter has its own boundaries and can be tested in a behavior-driven way.
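To make the contrast concrete, here is a small sketch built around a hypothetical `Checkout` class (an assumption for illustration, not something from the article). Both tests pass today. The first is tied to the private helper and would fail if `_apply_discount` were renamed or inlined; the second only knows what a caller can observe.

```python
from unittest import mock


class Checkout:
    """Hypothetical unit of behavior: loyalty members pay 10% less."""

    def total_cents(self, prices_cents, loyalty_member):
        subtotal = sum(prices_cents)
        if loyalty_member:
            return self._apply_discount(subtotal, percent=10)
        return subtotal

    def _apply_discount(self, subtotal_cents, percent):
        # Internal detail: integer arithmetic keeps the amounts in cents.
        return subtotal_cents - subtotal_cents * percent // 100


def test_clear_box_total_calls_the_discount_helper():
    # Clear-box: replaces the private helper and asserts how it was called.
    # Coupled to structure, so a pure refactoring can break it.
    with mock.patch.object(
        Checkout, "_apply_discount", return_value=900
    ) as helper:
        assert Checkout().total_cents([400, 600], loyalty_member=True) == 900
    helper.assert_called_once_with(1000, percent=10)


def test_opaque_box_loyalty_members_pay_ten_percent_less():
    # Opaque-box: asserts only on observable behavior at the boundary.
    assert Checkout().total_cents([400, 600], loyalty_member=True) == 900
    assert Checkout().total_cents([400, 600], loyalty_member=False) == 1000
```

The opaque-box test is still a unit test. It is small and fast, and it survives any change to how the discount is computed internally.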
Mocking: All or Nothing
In a clear-box testing approach, there's often heavy use of mocks. But when you overuse mocks, tests become harder to maintain. Maybe that is what Mark Seemann refers to when he says that stubs and mocks break encapsulation.
Once you start facing this kind of problem due to heavy mocking, it's normal to start to hate mocking. So, you try to avoid it at all costs. An API-only testing approach will commonly lead to the need for heavy mocking.
Once again, I question whether it was a problem due to mocking or misusing mocks.
Mocks and stubs may be harder to maintain, but they exist for a reason. They have a valid role to fulfill in making tests faster and more stable. It's our responsibility to control them. We don't want to overuse them beyond where they are essential.
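One way to keep test doubles under control is to reserve them for dependencies that are genuinely slow or unstable. The sketch below assumes a hypothetical remote exchange-rate service hidden behind an `ExchangeRates` port, with a hand-written stub standing in for it. All names are illustrative assumptions.

```python
from typing import Protocol


class ExchangeRates(Protocol):
    """Port to a remote rate service (hypothetical)."""

    def rate(self, source: str, target: str) -> float: ...


class FixedExchangeRates:
    """Stub: answers from a fixed table instead of calling the network."""

    def __init__(self, rates: dict) -> None:
        self._rates = rates

    def rate(self, source: str, target: str) -> float:
        return self._rates[(source, target)]


def convert_cents(
    amount_cents: int, source: str, target: str, rates: ExchangeRates
) -> int:
    """Behavior under test: convert an amount, rounding to whole cents."""
    if source == target:
        return amount_cents
    return round(amount_cents * rates.rate(source, target))


def test_converts_using_the_published_rate():
    rates = FixedExchangeRates({("EUR", "USD"): 0.5})
    assert convert_cents(1000, "EUR", "USD", rates) == 500


def test_same_currency_needs_no_rate():
    # An empty table would raise KeyError if the service were consulted.
    assert convert_cents(1000, "EUR", "EUR", FixedExchangeRates({})) == 1000
```

The stub replaces exactly one thing, the network call. Everything else in the test runs for real, and the assertions remain about results rather than about which methods were invoked.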
Reduce the Public Surface of Your Code
Another side effect of clear-box testing is that it leads to exposing more code than needed. Validators, mappers, and other pieces of code that could be internal implementation details are now part of the public contract just because we exposed them for the sake of testing. Also, anyone working in the Java and C# world knows how prevalent Interfaces are in their codebases. Once again, for the sake of testing. To mock a dependency, the developer might introduce an Interface.
Once a piece of code is accessible from the outside, it becomes harder to change, and tests become required. This will lead to code where maintainability is a problem and refactoring is almost impossible without rewriting a ton of unit tests.
On the surface that looks like an argument in favor of integration tests, since integration tests focus on the outer layer where many of these implementation details don't leak.
Once again, I ask: is it a problem with unit tests, or with the way we are implementing unit tests? If we implement unit tests in an opaque-box way, ignoring the internal design decisions and only being concerned with what consumers need, it will lead to a smaller contract. A contract that is easier to test, with fewer tests, and tests that are easier to maintain.
Architecture as the Guiding Principle
Tests tend to grow around architecture. We design our systems, then we think about testing. When we do that, systems can become harder to test. We have seen that happen with multi-layer architectures, where the dependency on data access technology brings complexity when unit testing the domain layer.
That can easily be avoided by adopting an architecture with test isolation in mind. From Hexagonal Architecture to Clean Architecture, we have many options from which we can pick.
This type of architecture is built to be device independent. All infrastructure dependencies are plugged into the system through dependency configuration. This type of architecture will make unit testing comfortable and lead you to use integration tests for what they should be: testing adapters to the outside world.
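The following sketch shows one possible shape for that dependency configuration, assuming a hypothetical `WelcomeService` that depends on a `Mailer` port. The host name and sender address are placeholders. The SMTP adapter uses Python's standard `smtplib`, but it is only wired in by the production composition function; the unit test plugs in a recording adapter instead.

```python
import smtplib
from email.message import EmailMessage
from typing import Protocol


class Mailer(Protocol):
    """Port: the only thing the core knows about sending mail."""

    def send(self, to: str, subject: str) -> None: ...


class SmtpMailer:
    """Infrastructure adapter; a candidate for an integration test."""

    def __init__(self, host: str, port: int, sender: str) -> None:
        self._host, self._port, self._sender = host, port, sender

    def send(self, to: str, subject: str) -> None:
        message = EmailMessage()
        message["From"] = self._sender
        message["To"] = to
        message["Subject"] = subject
        message.set_content(subject)
        # The network is touched only here, never in the core.
        with smtplib.SMTP(self._host, self._port) as smtp:
            smtp.send_message(message)


class RecordingMailer:
    """Test adapter: remembers what would have been sent."""

    def __init__(self) -> None:
        self.sent: list[tuple[str, str]] = []

    def send(self, to: str, subject: str) -> None:
        self.sent.append((to, subject))


class WelcomeService:
    """Core behavior, independent of any delivery mechanism."""

    def __init__(self, mailer: Mailer) -> None:
        self._mailer = mailer

    def welcome(self, email: str, name: str) -> None:
        self._mailer.send(email, f"Welcome, {name}!")


def build_production_service() -> WelcomeService:
    # Composition root: the one place where infrastructure is plugged in.
    # Host and sender are placeholder values for this sketch.
    mailer = SmtpMailer("smtp.example.invalid", 25, "noreply@example.invalid")
    return WelcomeService(mailer)


def test_welcome_sends_one_personalised_message():
    mailer = RecordingMailer()
    WelcomeService(mailer).welcome("ada@example.org", "Ada")
    assert mailer.sent == [("ada@example.org", "Welcome, Ada!")]
```

With this arrangement the unit test never opens a socket, and `SmtpMailer` can be covered separately by an integration test against a real or local mail server.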
Integration testing only the adapters introduces a weak spot into our testing strategy. When you integration test with all the components connected, you gain the advantage of testing things like configuration and composition. We obviously want to test that, and we can still run tests with all components connected. The difference is that those become "smoke tests" and don't need to test every single corner case. That will lead to more stable and reliable tests.
Conclusion
It is as important to question old beliefs in the industry as it is to know them well before starting to question them.
We know the past repeats itself. We know that the past also informs our decisions about the future. We should also know that we tend to make the same mistakes over and over again. It's human nature, but it's up to us to avoid doing it.
Testing strategies are one of those cases where we tend to repeat our mistakes. We are addressing the pain caused by a lack of good information and education while avoiding the existing good practices.
Testing and architecture are deeply connected. It's up to us to design architectures with testing in mind. And unit testing will still be a tool we use in our pursuit of good testing strategies.
Community comments
Some more thoughts about unit tests and integration tests
by David Karr,
In my experience, unit tests don't work as well as they should because management pushes the idea that the purpose of unit tests is to get test coverage, because business emphasizes measurable goals. Unfortunately, with that goal in mind, instead of "verifying that the code under test has done what it's supposed to do", along with inexperienced developers, the vast majority of the unit tests in our organization are what I call "code exercisers", which do run the code under test, but do little to no verification (not counting "not null" tests, which seems to be what they think is what a test is supposed to do), but do get that all important test coverage.
Something else that contributes to this tendency is having the program office directly drive development, which means it is entirely feature-driven, with little to no quality governance.
Concerning integration tests, in theory they are very important, but it's very difficult to manage the data. If you need to have integration tests that can be run on every build, you have to have data that works for every build. As a result, integration tests are de-emphasized until they only test trivial integrations.
Some thoughts
by Volodymyr Aleksandrov,
In my opinion, unit tests lead to better code structure. When integration tests prevail, parts of our code can end up implemented as very long functions with mixed responsibilities and other violations of the SOLID principles, which also leads to code that is hard to maintain.
Re: Some thoughts
by Guilherme Ferreira,
That's a good point. A behavior that is hard to test might be a design improvement opportunity.
We can see that in practices like TDD, where we can drive the design through tests.
Re: Some more thoughts about unit tests and integration tests
by Guilherme Ferreira,
Absolutely. Often organizations use the wrong incentives to drive the testing strategy. Using vanity metrics like Code Coverage can lead to false expectations. Developers can satisfy the metric, but in the end you don't get the expected result because the tests are of poor quality.