“El-Habya'a” or The Technical Debt
Alright, we had a deadline to meet and time was tight. We had to fix bugs as quickly as possible. One of these bugs resisted all our attempts! Then one of my colleagues took over the debugging task. He hardwired some values that were supposed to be retrieved from the database and would not change during the first couple of months of the system's operation and… voilà… the system started to work like a charm!
This colleague of mine, in his magnificently funny Egyptian slang, calls this type of code hack "Habya'a", which translates to "kludge"!
Aside from my colleague and his creative slang, in 1992 Ward Cunningham called such bad coding practices “Technical Debt”, which Wikipedia defines as “The work that needs to be done before a particular job can be considered complete”1. Steve McConnell defines technical debt as “a design or construction approach that’s expedient in the short term but that creates a technical context in which the same work will cost more to do later than it would cost to do now.”2
Now, if we look at technical debt from a pragmatic angle, we can see that it is not always a bad thing. It is the toll you pay to ride the highway to delivery when the deadline was yesterday. A friend of mine once told me, “Technical debt is like parking in a no-parking area: it is wrong and may cost you a traffic ticket, but sometimes I had to do it to catch an important appointment in a nearby building!”
So sometimes the cost-benefit ratio justifies it! However, technical debt must be paid back in a timely way, because it shares another feature with its financial counterpart: it carries interest.
This interest is the extra effort we spend each time we maintain the system because of tight coupling, overly large classes, untested code or any other form of technical debt that makes the code or design especially difficult to maintain.
From my observations, the interest on technical debt is not fixed; it increases with time. Every time we maintain a system that carries technical debt (let’s say every sprint), we spend more effort than we did the previous time. This happens for the following two reasons:
- Maintenance most likely introduces additional debt, because as long as parts of the system are messy, any maintenance on them tends to follow the same coding or design approach. This newly added technical debt incurs additional maintenance effort the next time, and so on.
- As time passes, and because design patterns are not followed and documentation is lacking, more developers apply different patches to the system, each based on their own assumptions about how a piece of code or design works. This inevitably introduces new bugs, and fixing these new bugs may introduce yet more bugs, and so on.
Based on the above, each interest payment actually contributes to increasing the debt, and therefore the interest is indeed compound interest. Compound interest is calculated using the following exponential equation:
$$Y_t = Y_0 (1 + r)^t$$
Where $Y_t$ is the debt value at sprint $t$, $Y_0$ is the starting value of the debt, $r$ is the growth rate per sprint, and $t$ is the sprint number. In the context of Agile projects “t” is an integer, so we can say that technical debt increases geometrically with time, since a geometric progression is the special case of an exponential function in which “t” takes only integer values.3
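To make the formula concrete, here is a minimal sketch that evaluates it sprint by sprint. The function name `debt_after` and the numbers used (10 ideal hours of starting debt, growing 20% per sprint) are assumptions chosen purely for illustration, not real project data.

```python
def debt_after(sprint: int, initial_debt: float, growth_rate: float) -> float:
    """Debt at a given sprint under compound growth: Y_t = Y_0 * (1 + r)^t."""
    return initial_debt * (1 + growth_rate) ** sprint


# Hypothetical values: 10 ideal hours of starting debt, growing 20% per sprint.
for t in range(6):
    print(f"sprint {t}: {debt_after(t, 10.0, 0.20):.2f} ideal hours")
```

With these assumed numbers the debt more than doubles by the fifth sprint even though nothing new was borrowed, which is the compounding effect described above.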
So, as chaotic code accumulates, the system becomes more fragile and more difficult to maintain. Another side effect is that the team's productivity drops as more and more of its time is spent on maintenance.
In one particular project, such bad coding practices were tolerated for a long time, and when we finally reached go-live, the system was crashing at the drop of a hat! No refactoring was possible without a huge impact on many parts of the system, so we decided to scrap the whole thing and build it again!
Figure 1 Stacked bar chart showing how technical debt interest increases geometrically over time. Values used in the chart are hypothetical and not real project data.
In the following section I outline a proposed process to manage technical debt. The initial assumption is that technical debt is a risk. This assumption is based on the definition of risk as “an uncertain event that might affect at least one of the project’s objectives in terms of scope, schedule, cost or quality”4. Technical debt fits this definition well because it is a potential threat that may negatively impact the project if it is not paid back on time.
Being an Agilist, I will explain this process as part of a Scrum project. In fact, I find managing technical debt especially necessary in agile projects because, more than in other methods, the fast delivery pace may encourage a quick and dirty coding style. Also, in Agile projects we design and architect just in time and rely on refactoring to catch up with any needed adjustments; therefore we almost always carry some technical debt, because we constantly have to optimize our code to allow the design to emerge.
The proposed Technical Debt Management Process is as follows:
- Set a Technical Credit Limit (TCL) - TCL is the maximum amount of Ideal Hours or User Story Points you wish to borrow. This limit can be calculated as a percentage of the total project size, for example 10%.
- Identify Technical Debt Opportunities - A Technical Debt Opportunity is a situation in which a team member wishes to bypass some of the good coding, design or testing practices for the sake of fast delivery. Identifying these opportunities should be done in the daily scrum meeting through group discussion.
- Log Technical Debt Tasks - For each Technical Debt opportunity two tasks need to be added to the Technical Debt Log with their sizes estimated in Ideal Hours or User Story Points. These two tasks are:
  - Exploit Task: what we need to do when we decide to take advantage of the technical debt opportunity. This is where the technical debt accumulates.
  - Pay-Back Task: what we need to do when we decide to refactor the code or design. The size of this task represents how much of our TCL the debt uses up until it is paid back.
- Select the Tasks - At the sprint planning meeting, select which technical debt opportunities you want to exploit. Technical debt opportunities should be selected based on the priority decided by the Product Owner. When an opportunity is selected, two things need to be done:
  - Deduct the size of the associated Pay-Back Task from the TCL.
  - Add the associated Exploit Task to the Sprint Backlog and add its size to the total size of the project.
If we find that the TCL is about to be exhausted, then we need to:
- Add a Pay-Back Task to the sprint backlog.
- Increase the TCL by an amount equal to the size of the Pay-Back Task once that task is done.
This way we use the TCL as a monitoring system that alerts us when technical debt starts to accumulate, so that we can work on restoring it to a healthy level.
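The bookkeeping behind this process can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `DebtOpportunity` and `DebtLedger`, the `alert_level` threshold standing in for "about to be exhausted", the refusal to borrow past the limit, and all the numbers are hypothetical and not part of the process as described.

```python
from dataclasses import dataclass, field


@dataclass
class DebtOpportunity:
    """One entry in the Technical Debt Log (sizes in story points or ideal hours)."""
    name: str
    exploit_size: float   # size of the Exploit Task
    payback_size: float   # size of the Pay-Back Task


@dataclass
class DebtLedger:
    credit_left: float    # remaining Technical Credit Limit (TCL)
    alert_level: float    # hypothetical threshold for "TCL is about to be exhausted"
    project_size: float   # total project size, grows with each Exploit Task
    outstanding: list = field(default_factory=list)

    def exploit(self, opp: DebtOpportunity) -> None:
        """Select an opportunity: deduct its Pay-Back size from the TCL."""
        if opp.payback_size > self.credit_left:
            # Assumption: refuse to borrow past the limit.
            raise ValueError(f"TCL exceeded by {opp.name!r}")
        self.credit_left -= opp.payback_size
        self.project_size += opp.exploit_size
        self.outstanding.append(opp)

    def pay_back(self, opp: DebtOpportunity) -> None:
        """Finish a Pay-Back Task: restore its size to the TCL."""
        self.outstanding.remove(opp)
        self.credit_left += opp.payback_size

    def needs_payback(self) -> bool:
        """True when the remaining credit has reached the alert level."""
        return self.credit_left <= self.alert_level


# Hypothetical project of 200 story points with a 10% credit limit.
ledger = DebtLedger(credit_left=200 * 10 / 100, alert_level=5.0, project_size=200.0)
hardwire = DebtOpportunity("hardwire database values", exploit_size=2, payback_size=8)
skip_tests = DebtOpportunity("skip unit tests", exploit_size=3, payback_size=9)

ledger.exploit(hardwire)       # credit_left: 20 -> 12
ledger.exploit(skip_tests)     # credit_left: 12 -> 3
print(ledger.needs_payback())  # True: 3 <= 5, so schedule a Pay-Back Task
ledger.pay_back(hardwire)      # credit_left: 3 -> 11
print(ledger.needs_payback())  # False
```

Tracking the remaining credit rather than the debt itself keeps the alert check to a single comparison, which fits the idea of using the TCL as a monitoring signal at sprint planning.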
Figure 2 Change of Technical Debt against Technical Credit Limit with exploit and pay-back tasks. When an Exploit task is done Technical Debt increases, and when Pay-Back task is done it decreases.
Managing Technical Debt as a Risk, together with a Technical Credit Limit, can effectively reduce the negative impact of technical debt while maximizing its benefit, especially in Agile methods, which are prone to abusing technical debt as a means of expedited delivery.
About the Author
Yaser Marey is an Egyptian Software Engineer and Project Manager, PMP and Scrum Master. For the last 14 years, he has been leading teams developing Enterprise Software Systems for national and regional customers. He is an enthusiast of Agile, Lean and Continuous Improvement, and he believes they are a necessity for the software industry in Egypt and the Middle East. Yaser shares his experience about software architecture and project management on his blog YasserOnline.
We become older, we can't do more sprints
But what about existing apps? We have to do a lot of sprints while we become older :)
Regarding the project you had to do again: why did you, as a project manager or QA supervisor, allow it that "bad coding practices were tolerated for a long time"? Or should the developer always be the guilty one? :)