Key Takeaways
- Technical debt is a business risk that wastes 23-42% of developers’ time. Its implications go deeper than financial waste, impacting well-being and job satisfaction alike.
- One explanation for why this waste is tolerated is that code is an abstract concept that isn’t accessible to non-technical stakeholders, which makes possible gains in code quality hard to translate into business value.
- Use the code health measure to quantify and communicate technical debt in an easily accessible way. Code health has a proven business impact on both time-to-market and the amount of unplanned work.
- Prioritize the identified technical debt based on interest rate, as determined by a Hotspot analysis. This limits the results to what’s actionable and relevant.
- Put a business expectation on code quality improvements in terms of shorter development times and reduced uncertainty in estimates.
Everyone in the software industry “knows” that code quality is important, yet we never had any data or numbers to prove it. Consequently, the importance of a healthy codebase is largely undervalued at the business level: more than 90% of IT managers lack processes and strategies for managing technical debt.
In this article we explore the impact by diving into recent research on code quality. With twice the development speed, 15 times fewer bugs, and a significant reduction of uncertainty in completion times, the business advantage of code quality is unmistakably clear. Let’s dig in and see how you can use those numbers to your advantage.
Defining technical debt
The technical debt metaphor originated as a way for developers to explain the need for refactoring and trade-offs to business people. However, the term is used in a much broader sense today.
Organizations take on technical debt for multiple reasons. Maybe we try to get a feature delivered quicker by compromising code quality, or maybe our understanding of the business changed and our design no longer matches. Or maybe our design just wasn’t a good fit to start with?
So while the root causes are different, the outcome is the same: we end up with code that’s more expensive to maintain than it should be.
Quantifying the waste from technical debt
Recent research on the business impact of code quality suggests that the average company wastes 23-42% of developers’ time due to technical debt and bad code in general. As if that weren’t alarming enough, studies on Software Developer Productivity Loss Due to Technical Debt also find that developers are frequently “forced” to introduce new technical debt as companies keep trading code quality for short-term gains like new features.
However, technical debt has implications that go deeper than financial waste. Technical debt impacts developer happiness and job satisfaction alike (see the paper by Graziotin & Fagerholm, 2019, on happiness and the productivity of software engineers, for a good overview).
Few developers enjoy working with bad code that could have been avoided in the first place. Such code causes us to become stuck during problem solving, with stress and time pressure following in its trail. Unmitigated technical debt with high interest fuels developer attrition.
The problem isn’t limited to developers. Over the years, I’ve also seen multiple technical leaders give up and leave for greener pastures. I still remember giving a presentation on technical debt to the CEO of a well-known corporation. Halfway through my lecture, the CEO exclaimed, “Now I understand why my former CTO left”. The ripple effects of unmitigated technical debt have demoralising consequences across the whole organisation.
Are we getting half the work done in twice the time?
Having a high-performing software organization is a competitive advantage. As a company, we need to be responsive to customer needs, act on feedback, and continue to innovate if we want to stay relevant in the market. And we need to pull that off during ever shorter product cycles.
Obviously, this requires skilled software people. Unfortunately there’s this global shortage of software developers: we cannot just keep hiring more people because they don’t exist.
Given those constraints, wasting close to half of our capacity on technical debt payments sounds like a poor trade-off to me.
As a thought-experiment, let’s assume that 42% of our developers’ time is indeed spent dealing with technical debt. This means that if you have 100 engineers in your organization, you’d get the equivalent output of just 58 people. However – and this is key – the actual waste is significantly larger. The coordination, management and communication overhead of 100 vs 58 people is very real. As Brooks’s Law tells us, adding more people comes with a cost. Working on an over-staffed project is painful: you spend more time in sync meetings than in your code editor.
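To make the thought experiment concrete, here is a minimal sketch of the arithmetic in Python. The 42% figure is the upper bound quoted above. The pairwise communication-path formula n(n-1)/2 is a common way to illustrate Brooks’s Law and is an illustrative assumption here, not a number from the research; the function names are hypothetical.

```python
def effective_headcount(engineers: int, waste_fraction: float) -> float:
    """Engineers' worth of output left after time lost to technical debt."""
    return engineers * (1.0 - waste_fraction)


def communication_paths(people: int) -> int:
    """Number of distinct pairs in a group of people: n * (n - 1) / 2."""
    return people * (people - 1) // 2


if __name__ == "__main__":
    team = 100
    productive = round(effective_headcount(team, 0.42))
    print(f"Effective output: {productive} engineers")
    print(f"Possible communication paths, {team} people: {communication_paths(team)}")
    print(f"Possible communication paths, {productive} people: {communication_paths(productive)}")
```

Under that simplified model, the 100-person organization has 4950 possible communication paths to maintain, while a team of 58 would have 1653. The organization pays the coordination cost of the larger number but only gets the output of the smaller team.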
Exploring technical debt in our own codebase
The main problem with technical debt is that code lacks visibility. Code is an abstract concept that isn’t accessible to all members of your organization. Hence, it’s easy to ignore technical debt even if we are aware of the general problem. Quantifying and visualizing the situation in your codebase is key, both for the engineering teams and for product and management.
Visualisations are wonderful as they let us tap into the most powerful pattern detector that we have in the known universe: the human brain. I explored the concept in depth in Your Code as a Crime Scene, and founded CodeScene back in 2015 to make the techniques available to a general audience.
The visualizations we use are quick to learn. Once there, they can empower the whole organization by pointing out the strong and weak parts in any codebase. Further, the visualization lets you assess how deep any potential problems go.
Let me share an example from two popular open source projects:
In the preceding visualization, each circle corresponds to a module with source code. The color reflects the health of that code (see manage technical debt for a deeper dive):
- Green code is easy to understand and low risk
- Yellow indicates a warning space with technical debt
- Red code means severe maintenance issues and a high degree of technical debt
Visualizing the health of your code is a solid first step. However, the actual cost of technical debt is the interest you have to pay on sub-standard code. To reason about interest rates, we need to look at how the organization interacts with the code they’re building. Let’s see how it works.
Prioritize technical debt by interest rate
The interest component in technical debt can never ever be calculated from code alone. Fortunately, we can get this critical information by tapping into version-control systems like Git. In particular, we can mine the change frequency – number of commits – of each piece of code and use that as a proxy for developer impact. When combined with a complexity metric like code health, we can identify complicated code that we have to work with often. I call that intersection hotspots:
Hotspots give us the relevance dimension, whereas code health complements with a quality dimension. To manage technical debt we need both perspectives. Here’s an example from a real-world codebase:
The big win with hotspots is that they limit the information to what’s actionable. As an organization, we simply cannot refactor and re-design all code at once. And neither should we. Hotspots let us balance short-term goals like new features with the long-term sustainability of the codebase.
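As a rough illustration of the idea, the sketch below counts how often each file changes and combines that with a quality score to rank candidates. It is a simplified approximation under stated assumptions, not how any particular tool computes hotspots: the input is assumed to be the text printed by `git log --name-only --pretty=format:`, the health scores are assumed to come from elsewhere on a hypothetical scale where 10 is perfectly healthy, and the priority formula is made up for the example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Mapping


def change_frequencies(git_log_output: str) -> Counter:
    """Count how many commits touched each file.

    Expects the text printed by `git log --name-only --pretty=format:`,
    which lists the files changed in each commit, one path per line,
    with blank lines between commits.
    """
    paths = (line.strip() for line in git_log_output.splitlines())
    return Counter(path for path in paths if path)


@dataclass(frozen=True)
class Hotspot:
    path: str
    commits: int
    health: float  # hypothetical score: higher means healthier code


def rank_hotspots(
    frequencies: Mapping[str, int],
    health_scores: Mapping[str, float],
    max_health: float = 10.0,  # assumed upper end of the health scale
    top: int = 5,
) -> List[Hotspot]:
    """Rank files that are both frequently changed and unhealthy."""
    candidates = [
        Hotspot(path, commits, health_scores[path])
        for path, commits in frequencies.items()
        if path in health_scores  # skip files we have no score for
    ]
    # Illustrative priority: change frequency times distance from perfect health.
    candidates.sort(key=lambda h: h.commits * (max_health - h.health), reverse=True)
    return candidates[:top]


if __name__ == "__main__":
    # Made-up log excerpt and scores, for demonstration only.
    sample_log = "src/billing.py\nsrc/api.py\n\nsrc/billing.py\n\nsrc/billing.py\ndocs/readme.md\n"
    scores = {"src/billing.py": 3.0, "src/api.py": 9.5, "docs/readme.md": 10.0}
    for spot in rank_hotspots(change_frequencies(sample_log), scores):
        print(spot.path, spot.commits, spot.health)
```

In the made-up sample, `src/billing.py` ranks first because it is both the most frequently changed file and the least healthy one. Healthy or rarely touched files fall to the bottom of the list, which is the point: they are not where the interest is being paid.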
By striking this balance, we pay down technical debt systematically and free time for valuable innovation. These outcomes can be a real game changer. A great example is the case study from Carterra, a leading biotech research company that reported an 82% reduction in unplanned work after addressing their hotspots. By pinpointing the exact files under active development together with their associated code health, Carterra could prioritise their efforts.
The measurable business impact of code quality
With code health and hotspots covered, we have everything we need for taking it full circle. Without a quantifiable business impact, it’s hard to make the case for investing in technical debt paydowns. Any measures we use risk being dismissed as vanity metrics while the code continues to deteriorate. We don’t want that to happen.
As discussed earlier, there hasn’t been any previous data which could support the importance of high code quality. In our Code Red paper, we set out to change the situation by investigating 39 commercial codebases from various industries and domains.
The Code Red paper shows that code quality has a dramatic impact on both time-to-market and the external quality of the product:
- Red code has, on average, 15 times more defects than a healthy codebase. This defect density will give a substandard product experience.
- Red code comes with a substantial waste. On average, adding a feature to red code takes more than twice as long as a corresponding change to green code.
- Finally, the Code Red research found that implementing a feature in red code can take up to 9 times longer than in green code.
Of these findings, the most important one is the unpredictability in code of low quality. Adding a feature to healthy code seems to be a predictable process. Not so in unhealthy red code where the data indicates a significant variation where work can take up to 9 times longer. This results in uncertainty for the organization.
Let me elaborate on what such uncertainty means by considering a hypothetical company with red code. That company might be able to implement a new capability in 9 months. If their competitors have green code, those competitors could get the same features in just one month. It’s going to be hard to keep up. Code quality is a real, measurable competitive advantage.
Code quality and speed are related
During my 25 years in the software industry, I’ve had so many people telling me that we don’t have time to do X. X could be refactoring, proper unit testing, or re-adapting our architecture to a changed business context. There seems to be this misconception that speed and quality are mutually exclusive. The data from the Code Red paper indicates that there is in fact no such trade-off. Instead, the reverse is true: we need high code quality in order to go fast.
To me, the increased throughput is largely an effect of the reduced uncertainty. Simple code has fewer surprises and less risk of breaking things. Healthy code also reflects a strong engineering culture, which most likely means that the organization has several other important practices in place.
Finally, it’s fascinating to note that the Code Red findings suggest a relationship similar to the one previously established by the Accelerate/DORA research: the shorter your cycle times for deployment, the fewer production failures.
Dealing with legacy or lower quality code
The code health numbers make it clear that we need to maintain a high bar. However, most of the time we aren’t writing new code. We have existing legacy code that we need to deal with. So how can you apply these techniques within that context?
Dealing with legacy and code quality issues at scale is challenging. Over the past decade, I’ve been fortunate to analyse more than 200 codebases. I’ve found that behavioral code analysis techniques are essential. The process I use is outlined below:
- Get situational awareness: where are the main issues, how severe are they, and how deep do they go?
- Summarize in a language that allows all stakeholders – engineering and management – to share the same understanding. The code health visualizations we looked at before are designed for this purpose.
- Prioritize based on impact: a common mistake is to set a quantitative goal (e.g. “we will reduce the number of code quality issues from 5000 issues down to 2500 over N months”). The thing is, no organization needs to fix all their technical debt. It’s just a common misconception stemming from ignoring the interest component of technical debt:
  - Low-interest debt might be a future risk that we need to be aware of, but it’s definitely not an immediate concern; view it as a free loan.
  - Instead, focus on development hotspots to remediate high-interest debt first. A hotspot analysis is essential in drawing this distinction.
- Make relevant progress visible: many organizations use the code health metrics as a KPI, and I find that valuable myself. However, true progress is measured in a business outcome. I recommend combining throughput metrics with a quality-focused feedback loop. Specifically, I have seen value in measuring trends in the amount of unplanned work (e.g. bug fixes, production crashes – anything that isn’t on the roadmap).
When technical debt is paid down, the amount of unplanned work also drops as we saw in the previous case study. This makes development more predictable and – a big bonus – the organization can focus on exciting features rather than reactive fixes. I’ve captured the details in my whitepaper on The Business Costs of Technical Debt.
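One simple way to track that trend is to compute the share of unplanned work per period and compare periods. The sketch below is only an illustration: the `Period` type, the way work items are classified, and the sample numbers are all assumptions made up for the example, not data from the case study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    label: str
    planned_items: int    # work that was on the roadmap
    unplanned_items: int  # bug fixes, production crashes, and similar


def unplanned_share(period: Period) -> float:
    """Fraction of completed work items that were unplanned."""
    total = period.planned_items + period.unplanned_items
    return period.unplanned_items / total if total else 0.0


def relative_reduction(before: Period, after: Period) -> float:
    """Relative drop in the unplanned share between two periods."""
    baseline = unplanned_share(before)
    if baseline == 0:
        return 0.0  # nothing to reduce
    return (baseline - unplanned_share(after)) / baseline


if __name__ == "__main__":
    # Made-up numbers, purely for demonstration.
    q1 = Period("Q1", planned_items=60, unplanned_items=40)
    q2 = Period("Q2", planned_items=90, unplanned_items=10)
    print(f"{q1.label}: {unplanned_share(q1):.0%} unplanned")
    print(f"{q2.label}: {unplanned_share(q2):.0%} unplanned")
    print(f"Relative reduction: {relative_reduction(q1, q2):.0%}")
```

Counting work items is a deliberate simplification. If your tracker records effort or time per item, weighting by that is likely to give a more faithful picture.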
Towards a data-driven approach for technical debt
In this article we saw that the waste due to technical debt is substantial and constitutes a real business risk. One explanation for why this waste is tolerated could simply be that the impact of technical debt hasn’t been possible to quantify at the source code level before.
The Code Red research introduced in this article gives us the option to challenge the status quo and elevate code quality to the level of a business KPI. Combined with the hotspot technique, we even make those findings actionable. Knowing, as a software company, where you have red versus green code enables a data-driven approach to technical debt. Use it to your advantage.