How Kanban Works
Recently, there has been growing interest in Kanban as a simple and effective method for managing software development and continuous improvement. But how (or maybe why) does Kanban work? Is it because it exposes the system and enables visual tracking of requests? Is it because it limits work-in-process and reduces the wasteful effect of task switching? Or maybe it is the frequent and granular feedback it gives managers through simple measurements like cycle time and throughput? In this article, we will dig into the details and study Kanban in the light of queuing theory and Little’s Law [1]. Also, using case studies, we will illustrate three typical problems that face managers of Kanban development systems, and how to resolve them. This will reveal some basic concepts and insightful ideas about how Kanban works.
Little’s Law in Software Systems
Little’s Law (named after John Little) is one of the ideas which Kanban rests upon. In software development, this is how Little's Law is stated:
WIP = Th * CT
Where WIP (Work In Process) = average number of unfinished items (bugs, user stories, change requests, etc.) in the development system
Th (Throughput) = average team output per unit of time
CT (Cycle Time) = average time it takes the team to finish one item
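As a quick illustration, the three forms of the law can be written as small Python helpers. This is only a sketch: the function names and the sample numbers (12 items in process, 3 items finished per week) are made up for the example and do not come from a real team.

```python
def average_wip(throughput: float, cycle_time: float) -> float:
    """WIP = Th * CT."""
    return throughput * cycle_time


def average_throughput(wip: float, cycle_time: float) -> float:
    """Th = WIP / CT."""
    if cycle_time <= 0:
        raise ValueError("cycle_time must be positive")
    return wip / cycle_time


def average_cycle_time(wip: float, throughput: float) -> float:
    """CT = WIP / Th."""
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return wip / throughput


if __name__ == "__main__":
    # Hypothetical team: 12 items in process, 3 items finished per week.
    print(average_cycle_time(wip=12, throughput=3))   # cycle time in weeks
    print(average_throughput(wip=12, cycle_time=4))   # items per week
    print(average_wip(throughput=3, cycle_time=4))    # items
```

All three quantities are long-run averages and must use the same time unit; mixing items per week with a cycle time in days would give a meaningless WIP.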
The dynamics of Little’s Law are remarkable. They explain many complications of software development and suggest resolutions. To analyze the dynamics of the following case studies, we will use Diagrams of Effects [2], an excellent tool for analyzing non-linear systems, or systems in which more than two effects or influences shape the behavior.
Case 1: Increase Team Throughput
Adam is a coach of a team of two developers and one tester responsible for maintaining a large number of products in the company. In 2013, the company invested in marketing the product and managed to double the number of customers. Now, Adam’s team has started to receive an increasing number of support requests. However, the CEO is not willing to increase the size of the team.
In this case, to meet the growing number of customer requests, the team will have to increase its throughput. According to Little’s Law (Th = WIP / CT), throughput can be increased by either reducing the cycle time or increasing the WIP. The team cannot reduce the cycle time because it has fixed capacity. So, the easy solution is to increase the WIP.
The question is: would increasing the WIP result in higher throughput? The answer is no, not beyond a certain point. A larger WIP increases throughput only up to a limit, after which throughput starts decreasing, as the following graph shows:
Figure 1. Relation between Throughput and WIP
As described in the next graph, increasing the WIP will stimulate the team to optimize its work and remove various kinds of waste from its delivery process (yellow area) until the team reaches the maximum possible throughput (green peak). After that, more WIP may not bring any further improvement; on the contrary, it may decrease the team's throughput due to stress and task switching (red area).
Figure 2. Team response to increasing WIP according to the amount of WIP
In the red area, the team is overwhelmed by external factors and experiences internal issues which decrease its productivity. The next diagram of effects analyzes the team dynamics in the red area:
Figure 3. Increasing WIP beyond team capacity will reduce team productivity and throughput
This diagram shows the effect of increasing the WIP above the team capacity. This will increase communication with customers, task switching, and stress. Working under stress and switching between tasks may result in more bugs and will eventually decrease productivity, which, in turn, will decrease Throughput.
To grasp the impact of this decision, the following diagram models the reinforcing effect: increasing WIP leads to lower productivity, which in turn piles requests up and raises the WIP, and so on. The system will keep looping and throughput will keep decreasing until the team hangs!
Figure 4. Increasing WIP beyond team capacity may cause two reinforcing loops. In this dynamic, WIP will keep increasing till the development system hangs!
Note: Two consecutive negative effects = one positive effect
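The reinforcing loop can be made tangible with a toy simulation. Everything in this sketch is an assumption made for illustration: the function name, the weekly time step, and above all the penalty formula, which simply shrinks throughput as WIP exceeds a nominal capacity. It is not a model taken from the diagrams, only one possible way to reproduce the same qualitative behavior.

```python
def simulate_overload(arrivals_per_week, peak_throughput, wip_capacity,
                      weeks, penalty=0.5):
    """Return a list of (week, wip, completed) tuples for a toy overload model."""
    wip = float(wip_capacity)  # start exactly at the assumed capacity
    history = []
    for week in range(1, weeks + 1):
        # Fraction by which WIP exceeds the capacity the team can handle.
        overload = max(0.0, wip - wip_capacity) / wip_capacity
        # Assumed penalty: stress and task switching shrink throughput.
        throughput = peak_throughput / (1.0 + penalty * overload)
        completed = min(throughput, wip)
        # Unfinished requests pile up and feed the next week's overload.
        wip = wip + arrivals_per_week - completed
        history.append((week, wip, completed))
    return history


if __name__ == "__main__":
    # Hypothetical numbers: 6 requests arrive per week, the team peaks at 5.
    for week, wip, completed in simulate_overload(6, 5, 10, weeks=8):
        print(f"week {week}: WIP={wip:.1f}, completed={completed:.2f}")
```

With arrivals above the peak throughput, each week ends with more WIP and less completed work than the week before, which is the loop described in Figure 4.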
To summarize, if you want to increase throughput, you may increase both team size and WIP. If this is not possible because team capacity is fixed, you are left with only one other option: decreasing the cycle time, which is all about discovering and removing waste.
Case 2: Stabilize Cycle Time
Ismail is a development manager committed to delivering changes to customers within a time limit specified in a Service Level Agreement (SLA). Ismail and his team receive customer requests at a fluctuating input rate. Sometimes the rate is higher than usual, causing the cycle time to exceed the SLA; at other times, the rate is low and does not consume all of the team's capacity. The following diagram explains why CT increases with higher input rates:
Figure 5. Higher input rates result in larger cycle time. Also, lower input rates result in smaller cycle times, as per Little's Law
Little’s Law implies that CT is directly proportional to WIP and inversely proportional to Throughput. So, if Ismail could stabilize these two parameters, the cycle time would stabilize accordingly.
To do that, Ismail should control both WIP and team capacity (the number of team members or the percentage of team time dedicated to handling customer requests) in response to higher or lower input rates. These two parameters have opposing effects on CT: increasing WIP increases CT, while increasing team capacity decreases CT. So, the two effects cancel out:
Figure 6. If managers can increase/decrease WIP and team capacity, they could stabilize CT and optimize team utilization and performance.
So, in summary, if the team is experiencing fluctuating input rates, it should control two parameters: WIP limits and team capacity. By controlling these two parameters, the team may stabilize cycle time and optimize its utilization.
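One way to apply this is to derive the WIP limit from the cycle time promised in the SLA. The sketch below assumes that throughput scales with team capacity, which the article implies but does not quantify; the function name and the sample figures are hypothetical.

```python
import math


def wip_limit_for_target(throughput_per_day: float,
                         target_cycle_time_days: float) -> int:
    """Largest whole WIP limit that keeps CT = WIP / Th within the target."""
    if throughput_per_day <= 0 or target_cycle_time_days <= 0:
        raise ValueError("throughput and target cycle time must be positive")
    # Little's Law: WIP = Th * CT. Round down so the target is not exceeded.
    return math.floor(throughput_per_day * target_cycle_time_days)


if __name__ == "__main__":
    sla_days = 4
    # Quiet period: assumed throughput of 1.5 items per day.
    quiet_limit = wip_limit_for_target(1.5, sla_days)
    # Busy period: capacity is doubled, assumed throughput of 3.0 items per day.
    busy_limit = wip_limit_for_target(3.0, sla_days)
    print(quiet_limit, quiet_limit / 1.5)  # WIP limit and resulting cycle time
    print(busy_limit, busy_limit / 3.0)
```

When capacity and the WIP limit are raised together, the ratio WIP / Th, and therefore the cycle time, stays the same.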
Case 3: Do Not Expedite Too Much!
Expediting work (using a priority lane in your Kanban board) may be an easy solution for recurring trouble reports and problematic service requests, especially with unhappy customers. In many cases, it is tempting to expedite with no rules, just to relieve special cases or respond to customers' loud complaints.
The expedite lane will consume part of the team’s time and effort, slow down development, and increase the average cycle time. According to Little’s Law, this will effectively reduce the team’s throughput:
Figure 7. As per Little's Law, if WIP is fixed, the higher the Cycle Time, the lower Throughput will become
What actually happens, however, is that the relationship between cycle time and throughput changes as well, as follows:
Figure 8. Throughput is dramatically reduced due to two factors, the increase in cycle time and the shift in the graph defining the relationship between Th and CT
In more acute cases, teams may switch to tasks in the expedite lane and start handling them the instant they are requested, leaving whatever tasks were in progress unfinished. Instant context switching has another, even more negative, impact on throughput. The following diagram explains this impact:
Figure 9. Expedited items raise the cycle time of items still in progress and, at the end of the day, negatively affect throughput.
To summarize the third case, beware of the expedite lane trap. It may have a negative effect on overall team productivity and may result in a higher average cycle time. While expediting may be useful in some minor emergency cases, it may open the door to other unintended negative dynamics.
So, in these three cases, we have demonstrated how Kanban works in the light of queuing theory. It is a very simple and very effective management tool. As a manager or team leader, you have parameters to control, namely WIP and team capacity. You also have process metrics like Cycle Time and Throughput, which you can measure very easily to get frequent feedback about process efficiency. In these three cases, we pointed out three basic issues when using Kanban:
- Try to study your team capacity and not overwhelm the team with extra work beyond this capacity. Plotting WIP against throughput would give you an indication about the maximum WIP they can endure.
- You may control more than one parameter to stabilize the development process. As in the second case, you may control WIP and team capacity to reach a stable cycle time.
- Beware of the expedite lane trap. In essence, it is a backdoor for process violation. If used without care, it may undermine team productivity.
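The first recommendation, plotting WIP against throughput, can be approximated without a chart by grouping observations by WIP level and looking for the level with the highest average throughput. The sketch below assumes you record one (WIP, throughput) pair per week; the function name and the sample data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def best_wip_level(samples):
    """Return (wip, average_throughput) for the WIP level that performed best.

    samples: iterable of (wip, throughput) observations, e.g. one per week.
    """
    by_wip = defaultdict(list)
    for wip, throughput in samples:
        by_wip[wip].append(throughput)
    if not by_wip:
        raise ValueError("at least one observation is required")
    averages = {wip: mean(values) for wip, values in by_wip.items()}
    best = max(averages, key=averages.get)
    return best, averages[best]


if __name__ == "__main__":
    # Made-up weekly observations: throughput rises with WIP, then falls.
    observations = [
        (4, 3), (4, 4),
        (6, 5), (6, 6),
        (8, 7), (8, 6),
        (10, 5), (10, 5),
        (12, 4), (12, 3),
    ]
    print(best_wip_level(observations))
```

With only a few observations per level the result is noisy, so it is best read as a starting point for a WIP limit and not as a precise optimum.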
[1] In a book chapter on Little's Law, published by the Massachusetts Institute of Technology, John Little explains the law and bridges theory and practice. It is an excellent read that is simple and yet gets to the heart of Little's Law.
One of the issues which this book chapter explains very well is the difference between Little's law in its original formulation (which considers the Arrival Rate as one of the formula parameters) and its application in manufacturing systems (which replaces Arrival Rate with throughput). Here is an explanation:
Little's Law states that:
L = λ W
Where L = average number of items in the queuing system
λ = arrival rate of new items to the system
W = average wait time for an item in the system
John Little argues that this law is robust, generic, and holds exactly for any queuing system given one essential condition: "to have a finite window of observation that starts [when the system is empty] and stops when the system is empty" (p.88).
As you may notice, this formulation is different from the one discussed in the article. Actually, there are two basic differences between Little’s Law in its original formulation and how it is stated in the software context (WIP = Th*CT). The original one talks about input or arrival rate, whereas the latter talks about output rate or throughput. The second issue is the condition which Little stated: that the system should start with 0 items and end at 0 items. In software, we rarely witness a system which has no maintenance requests.
To resolve these differences, Little indicated a more subtle condition for the Law to hold: no items may enter the system and get lost or never get finished, which he calls “conservation of flow” (p. 93). If we apply this condition to the system, we can easily show that the input rate = output rate, and therefore we can relate lambda (λ) to throughput (Th).
For the second condition (system should start and end in 0 items), Little argues that the law “still applies, at least as an approximation, as long as we select a time interval that is long enough”. (p. 93)
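The exactness of the law over a window that starts and ends empty can be checked numerically. The sketch below computes L by integrating the number of items in the system over time, and λ and W directly from arrival and departure times, then compares L with λ W. The function names and the sample intervals are made up for illustration.

```python
def time_average_in_system(intervals, start, end):
    """L: time-averaged number of items in the system over [start, end]."""
    events = []
    for arrival, departure in intervals:
        events.append((arrival, +1))
        events.append((departure, -1))
    events.sort()
    area = 0.0      # integral of the item count over time
    count = 0       # items currently in the system
    previous = start
    for time, delta in events:
        area += count * (time - previous)
        count += delta
        previous = time
    area += count * (end - previous)
    return area / (end - start)


def arrival_rate_and_wait(intervals, start, end):
    """Return (λ, W): arrivals per unit time and average time in the system."""
    n = len(intervals)
    total_wait = sum(departure - arrival for arrival, departure in intervals)
    return n / (end - start), total_wait / n


if __name__ == "__main__":
    # Hypothetical (arrival day, departure day) pairs; the system is empty
    # before day 0 and empty again at day 10.
    items = [(0, 4), (1, 3), (2, 8), (5, 10)]
    L = time_average_in_system(items, 0, 10)
    lam, W = arrival_rate_and_wait(items, 0, 10)
    print(L, lam * W)
```

If an item were still in progress when the window closed, the two sides would no longer match exactly, which is why Little falls back on a sufficiently long interval as an approximation.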
[2] Diagrams of Effects were first introduced in the famous four-volume series Quality Software Management by Gerald Weinberg. They are a great enabler when trying to understand the dynamics of a system exhibiting non-linear behavior, such as a software development team.
Diagrams of Effects are similar to Causal Loop Diagrams (CLD), but they differ slightly in notation and are more powerful in modeling human interventions in the system. A diagram of effects consists primarily of nodes and arrows. Each node corresponds to a measurable quantity. Each arrow corresponds to an effect (whether positive or negative) which the source node exerts on the target node. This is a full manual of how to draw and use Diagrams of Effects.
About the Author
Amr Noaman Abdel-Hamid is an agile coach, trainer and practitioner whose life vision is to spread agile and lean thinking in Egypt and the Middle East. Amr co-founded Agile Academy to help teams and organizations deliver software at their maximum potential. Amr is a founding member of the Egypt Lean & Agile Network and had the honor to initiate Egypt's GoAgile program, an agile adoption initiative sponsored by the Egyptian government for expanding agile in Egypt. Since then, Amr has trained 400+ practitioners and coached many teams. Amr is a frequent speaker and author of several industrial reports, and shares his thoughts on his blog, Tales of Agile Software Development. You can reach Amr through email, LinkedIn, or Twitter @amrnoaman.