
#NoEstimates Project Planning Using Monte Carlo Simulation

 

Customers come to us with a new product idea and they always ask the same questions: how long will it take, and how much will it cost to deliver? Reality is uncertain, yet we as software developers are expected to deliver new products with certainty.

To increase the chances of project success we need to incorporate the uncertainty in our planning and exploit it. We can’t control the waves of uncertainty, but we can learn how to surf! We do that by planning with reference class forecasting, which promises more accurate forecasts by taking an “outside view” of the project being forecasted, based on knowledge about actual performance in a reference class of comparable projects. This approach aligns with the #NoEstimates paradigm, which aims at “exploring alternatives to estimates [of time, effort, cost] for making decisions in software development” (Zuill, 2013). To me #NoEstimates means “No effort estimates”, which stands both for “Effortless estimates”, i.e. estimating with minimal effort, and for “Not using estimates of effort”.

Deterministic planning, as commonly practiced today, forces certainty on uncertain situations and masks the uncertainty instead of highlighting it. It calculates project-specific costs based on a detailed study of the resources required to accomplish each activity of work contained in the project’s work breakdown structure; in other words, it takes an “inside view” of the project being estimated. For high-level planning, deterministic estimation of all work items wastes people’s time and implies precision that isn’t there.

The techniques presented here are fast, and for most projects they will produce more accurate results.

High-level probabilistic planning

The present-day project management paradigm is based on the first principle of Scientific Management, namely “In principle it is possible to know all you need to know to be able to plan what to do” (Taylor, 2006). It recognizes that uncertainty plays a role in project management, but believes that uncertainty can be eliminated through more detailed planning. It models projects as a network of activities and calculates the time needed to deliver a project by estimating the effort required to accomplish each activity of work contained in the project’s work breakdown structure (PMI, 2009).

We argue that planners cannot know everything they need to know, that the world as such is uncertain, and that every number is a random variable. We challenge the project management paradigm and suggest that for planning purposes it is better to model projects as a flow of work items through a system.

Hence the definition: a project is a batch of work items, each one representing independent customer value, that must be delivered on or before a due date. The batch contains all the work items that need to be completed to deliver a new product with specified capabilities. In order to prepare the batch, the product scope needs to be broken down into work items, each one representing independent customer value. Even for a quality-related requirement such as “the system should scale horizontally” we need to have a work item. It is important that each of the work items can be delivered in any order, like user stories created following the INVEST mnemonic. We don’t try to estimate the size of the work items. There are only two “sizes”: “small enough” and “too big”. The two sizes are context specific. They have no correlation to the “effort” needed. Items that are “too big” should be split and not allowed to enter the backlog.

A probabilistic high-level plan forecasts the initial budget and the range of the time frame for a project. We don’t plan in detail what is not absolutely necessary to plan. The short-term details, like scheduling, are decided based on immediate needs and capabilities, and we create these schedules during the execution of the high-level plan. When executing the high-level plan we have to keep focus on the project intent, but we can never be certain which paths will offer the best chances of realizing it. We exploit uncertainty by making a series of small choices which open up further options, then observing the effects of our actions and exploiting unexpected successes.

We plan probabilistically by using reference class forecasting which does not try to forecast the specific uncertain events that could affect the new project, but instead places the project in a statistical distribution of outcomes from a class of reference projects.

Reference class forecasting

Reference class forecasting is based on the work of Princeton psychologist Daniel Kahneman, who won the Nobel Prize in Economics in 2002.

Reference class forecasting for a particular project requires the following three steps (Flyvbjerg, 2007):

  1. Identification of a relevant reference class of past, similar projects. The class must be broad enough to be statistically meaningful but narrow enough to be comparable with the specific project.
  2. Establishing a probability distribution for the selected reference class. This requires access to credible, empirical data for a sufficient number of projects within the reference class to make statistically meaningful conclusions.
  3. Comparing the new project with the reference class distribution, in order to establish the most likely outcome for the new project.

Let’s apply the reference class forecasting method for forecasting the delivery time for a new project.

Identification of a relevant reference class

The projects in the reference class should have comparable:

  • Team structures
  • Technologies used
  • Development processes used and the methods of capturing the requirements
  • Client types
  • Business domains

Please note that along with the internal characteristics of the projects we also compare the contexts in which the projects were executed. The same team may perform differently depending on whether the client is a startup or a Fortune 500 corporation, because collaboration with the stakeholders differs. On the other hand, when comparing projects we should not go into great detail. Our goal is to establish a reference class that is broad enough to be statistically meaningful but narrow enough to be comparable with the new projects we will be working on.

Establishing a probability distribution for the selected reference class

We need to decide on the metric for which we will establish the probability distribution. The metric should allow for taking an outside view of the development system that worked on the project, allow for calculating delivery time, and make sense from the client’s perspective. Takt Time is such a metric. Takt Time is the rate at which a finished product needs to be completed in order to meet customer demand. It is defined as the available production time divided by customer demand. In other words, Takt Time is the average time between two successive deliveries to the customer.

In what units of time do we measure Takt Time? In manufacturing, Takt Time is measured in hours, minutes or even seconds for mass production. In knowledge work we measure Takt Time in days.
Here is a diagram presenting the delivery rate for a fictitious project. Each yellow box represents a work item delivered to the customer. Here we are again taking an outside view and are not interested in the “size” of each work item or in the way the development system works internally.

On the left we have the start date for the project and on the right we have the end date. We can see that five days after the project started the first work item was delivered. Its Takt Time is 5 days. Seven days after that two new work items were delivered. Now what is their Takt Time? The first work item has a Takt Time of seven days, but the second one has a Takt Time of zero days. That is because the time between the two work items is zero days. It is not zero minutes but since we measure Takt Time in days it is zero days. Two days after that three new work items were delivered. According to the definition of Takt Time one of them has Takt Time of two days but the other two work items both have Takt Time of zero days. And we see how it went – eventually all 10 work items were delivered.

An important thing to note here is that the sum of all Takt Time values equals the delivery time of the project, in this case 22 days.
Here is a histogram of the Takt Time for the above delivery rate. Note the number of work items with Takt Time of zero.
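To make the calculation concrete, here is a minimal Python sketch that derives the per-item Takt Time from the day each work item was delivered and checks that the values sum to the delivery time. The delivery days are hypothetical: they match the narrative above only up to day 14, and the later dates are made up so that the project ends on day 22.

    # Hypothetical delivery days, counted from the project start:
    # 1 item on day 5, 2 items on day 12, 3 items on day 14,
    # and 4 more items on made-up days ending on day 22.
    delivery_days = [5, 12, 12, 14, 14, 14, 17, 19, 21, 22]

    takt_times = []
    previous_day = 0
    for day in delivery_days:
        takt_times.append(day - previous_day)  # whole days since the previous delivery
        previous_day = day

    print(takt_times)                          # [5, 7, 0, 2, 0, 0, 3, 2, 2, 1]
    print(sum(takt_times))                     # 22 - equals the project delivery time
    print(sum(takt_times) / len(takt_times))   # 2.2 - the average Takt Time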

Takt Time is calculated by dividing the time T over which the project is or will be delivered by the number of work items delivered: TT = T / N, where

  • T is the time period over which the project will be delivered
  • N is the number of items to be delivered in [0, T]
  • TT is the Takt Time

In our project we have 22 days delivery time and we have 10 stories delivered hence we have Takt Time of 2.2 days.

That means that on average the time between two successive deliveries is 2.2 days. Note that it is an unqualified average, a single number without variance.

If we know the Takt Time for the system and we have N work items to be delivered, we can calculate how much time it will take the system to deliver all N work items. The formula is the Takt Time multiplied by the number of work items to be delivered: T = N × TT.

For instance if we have to deliver 45 stories and the Takt Time is 2.2 days then it will take the system 99 days to deliver.

Here comes an important point: because Takt Time is the average value of a random variable, a forecast based on this single number has a real chance of being missed. To get better odds, we need to use the probability distribution of Takt Time.
We usually don’t know how the Takt Time is distributed. How can we find that out?

Enter bootstrapping

Using historical samples via bootstrapping we can infer the distribution of Takt Time and its likelihoods.

Bootstrapping is based on the assumption that the sample is a good representation of the unknown population. Bootstrapping is done by repeatedly re-sampling a dataset with replacement, calculating the statistic of interest and recording its distribution. It does not replace or add to the original data.

In this case the statistic of interest is the average time between two successive deliveries or Takt Time.

Bootstrapping the distribution of Takt Time

Now let’s see how bootstrapping can be applied for inferring the Takt Time distribution. The steps are as follows:

  1. Take the Takt Time (TT) sample of size n
  2. Take the number of work items delivered (N)
  3. Draw n observations TTi, with replacement, out of the sample from step 1
  4. Calculate the project delivery time (T) for the sample from step 3 using T = ΣTTi
  5. Calculate the Takt Time TT = T / N using T from step 4 and N from step 2
  6. Repeat many times
  7. Prepare the distribution of the bootstrapped Takt Time values

Here is the method applied using the Takt Time data for our fictitious project.
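For those who prefer code, here is a minimal Python sketch of the bootstrap described above. The Takt Time sample is a hypothetical one for the fictitious project; the exact values are illustrative only.

    import random

    # Hypothetical Takt Time sample (in days) for the 10 delivered work items.
    takt_sample = [5, 7, 0, 2, 0, 0, 3, 2, 2, 1]
    n = len(takt_sample)      # sample size
    N = 10                    # number of work items delivered
    iterations = 10_000

    bootstrapped_tt = []
    for _ in range(iterations):
        resample = random.choices(takt_sample, k=n)  # draw n observations with replacement
        T = sum(resample)                            # project delivery time for this resample
        bootstrapped_tt.append(T / N)                # Takt Time for this resample

    bootstrapped_tt.sort()
    mean_tt = sum(bootstrapped_tt) / iterations
    median_tt = bootstrapped_tt[iterations // 2]
    p85_tt = bootstrapped_tt[int(0.85 * iterations)]
    print(f"mean: {mean_tt:.2f}, median: {median_tt:.2f}, 85th percentile: {p85_tt:.2f}")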

And here is the Takt Time histogram bootstrapped using data from the fictitious project. Note the Median, Mean and the 85th percentile.

Now we have the probability distribution of Takt Time for our fictitious project. Note that the average value of 2.26 is very close to the Takt Time we calculated initially. Now we have not only the average but also the mode, the median and the percentiles. This Takt Time distribution represents a context-specific uncertainty and is unique per context (team structure, delivery process used, technology, business domain and client type). The distribution should be preserved in a library of reference classes to be used for forecasting new projects implemented in the same context. Reusing it greatly reduces both the theoretical knowledge and the effort required, facilitating the use of probabilistic modeling. The distribution is invalidated if any of the following changes: team structure, development process, technology being used, client type or business domain.

Comparing the new project with the reference class distribution

An important thing to note is that T = N × TT assumes a linear delivery rate. Do projects have a linear delivery rate? Not really. Here is a diagram that visualizes the rate at which the work items were delivered on a real project.


On the X axis we have the project time in days. On the Y axis we have the number of work items delivered each day. It turns out that the delivery rate follows a “Z-curve pattern” (Anderson, 2003) as visualized by the red line.

The Z-curve can be divided into three parts, or legs. There is empirical evidence that for the first 20% of the time the delivery rate will be slow. Then for 60% of the time we’ll go faster; this is the “hyper-productivity” period. And for the last 20% we’ll go slowly again. Of course the numbers may vary depending on the context, but the basic principle of the three sections holds.

Each leg of the Z-curve is characterized by:

  • Different work type
  • Different level of variation
  • Different staffing in terms of headcount and level of expertise

Only the second leg of the Z-curve is representative of the system’s capability. It shows the common cause variation specific to each system. The first and third legs are project-specific and are affected by special cause variation.

The first leg of the Z-curve is the time when the developers climb the learning curve and get into the mindset of the new project. But this leg of the Z-curve can also be used for:

  • conducting experiments to cover the riskiest work items
  • Innovation!
  • setting up environments
  • adapting to client’s culture and procedures
  • understanding new business domain
  • mastering new technology

All of the above are examples of special causes of variation specific to a project.

The second leg of the Z-curve is the productivity period. If the project is scheduled properly the system should be like clockwork – sustainable pace, no stress, no surprises…

The third leg of the Z-curve is when the team will clean up the battlefield, fix some outstanding defects and support the transition of the project deliverable into operation.

Project delivery time T

Project delivery time can be represented as the sum of the durations of each one of the three legs of the Z-curve. In other words it equals the duration Tz1 of the 1st leg plus the duration Tz2 of the 2nd leg plus the duration Tz3 of the 3rd leg of the Z-curve.

T = Tz1 + Tz2 + Tz3

Let’s substitute the duration of each of the three legs using the formula T = N × TT. Now we have a new formula that, if we know the Takt Time and the number of work items to be delivered during each of the three legs of the Z-curve, allows us to calculate how much time it will take the system to deliver all N work items, where N = Nz1 + Nz2 + Nz3 is the total number of work items for the project.

T = Nz1 × TTz1 + Nz2 × TTz2 + Nz3 × TTz3

Here we are calculating the delivery of Nz1 work items with Takt Time TTz1 during the 1st leg of the Z-curve, plus Nz2 work items with Takt Time TTz2 during the 2nd leg, plus Nz3 work items with Takt Time TTz3 during the 3rd leg. This calculation is not credible because it uses the Takt Time as a single number, and we know we should use a distribution of the Takt Time instead. We need distributions of Takt Time for each of the three legs of the Z-curve. We already know how to obtain those using the bootstrap. Then we have to sum them, but by definition they are random variables. How can we sum up random variables? Here comes Monte Carlo analysis. Monte Carlo simulation is a tool for summing up random variables (Savage, 2012).

Monte Carlo simulation of project delivery time (T) based on the Z-curve

The steps are as follows:

  1. Have three Takt Time distributions (TTz1, TTz2, TTz3), each one of size n, for the three legs of the Z-curve
  2. Have the number of work items to be delivered in each of the three legs of the Z-curve (Nz1, Nz2, Nz3), where N = Nz1 + Nz2 + Nz3
  3. Draw one observation, with replacement, from each of (TTz1, TTz2, TTz3)
  4. Calculate the project delivery time (T) for the sample from step 3 using T = Nz1 × TTz1 + Nz2 × TTz2 + Nz3 × TTz3
  5. Repeat many times
  6. Prepare the delivery time (T) probability distribution

Let’s see how we can apply the above algorithm using some real data.

Let’s take a new project that we have to plan and for which we must provide the customer with a delivery date. We already have a reference class of projects, and when we compare the new project with the reference class we see that the new project is for the same customer, the same team will be working on it, and it uses the same technology. For the reference class we also have the Takt Time distributions for each of the three legs of the Z-curve.


After some analysis the team has broken down the new project scope into user stories and then added some more work items to account for Dark Matter and Failure Load. After that the team decided that 12 stories will be delivered in the 1st leg of the Z-curve, 70 stories in the 2nd leg, and 18 stories in the 3rd leg.

If we visualize the Takt Time values using their respective distributions, then the Monte Carlo simulated summation of the three legs will give us the time needed to deliver the project!

We simulate this summation, say, 50,000 times. That gives us the simulated time needed to deliver the new project.
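As a rough illustration, here is a minimal Python sketch of that simulation. The three Takt Time samples are hypothetical stand-ins for the bootstrapped reference class distributions; only the 12/70/18 story split and the number of iterations come from the project above, so the printed figures will only roughly resemble the ones discussed below.

    import random

    # Hypothetical Takt Time samples (days) for the three Z-curve legs; in practice
    # these come from the reference class library, not from made-up numbers.
    tt_z1 = [1.5, 2.0, 1.8, 2.5, 2.2, 1.9]   # 1st leg: slow ramp-up
    tt_z2 = [0.5, 0.6, 0.7, 0.5, 0.8, 0.6]   # 2nd leg: hyper-productivity
    tt_z3 = [0.8, 1.0, 1.2, 0.9, 1.1, 1.0]   # 3rd leg: wind-down

    n_z1, n_z2, n_z3 = 12, 70, 18            # work items planned per leg
    iterations = 50_000

    delivery_times = []
    for _ in range(iterations):
        # Draw one Takt Time observation per leg, with replacement, and sum the legs.
        t = (n_z1 * random.choice(tt_z1)
             + n_z2 * random.choice(tt_z2)
             + n_z3 * random.choice(tt_z3))
        delivery_times.append(t)

    delivery_times.sort()
    median = delivery_times[iterations // 2]
    p85 = delivery_times[int(0.85 * iterations)]
    print(f"median: {median:.0f} days, 85th percentile: {p85:.0f} days")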

We end up with a histogram of the projected delivery time for our new project. We are interested in the Median, Average and the 85th percentile of the project delivery time (T), and in the shape of the distribution. Based on the Projected Delivery Time histogram we can take the 85th percentile and use it as a single number.

For this project the 85th percentile is 90 days. So 6 times out of 7 we should have the project delivered in 90 days or less.

Conclusions

By taking an outside view when forecasting a new project we will produce more accurate results faster than using the deterministic inside view. The method presented can be used by any team that uses user stories for planning and tracking project execution no matter the development process used (Scrum, XP, kanban systems).

My hope is that you will start using the techniques presented here for planning your next project. And don’t forget that even if we can’t control the waves of uncertainty we can learn how to surf!

References

Anderson, D. J. (2003). Agile Management for Software Engineering: Applying the Theory of Constraints for Business Results. Prentice Hall.

Flyvbjerg, B. (2007). Eliminating Bias in Early Project Development through Reference Class Forecasting and Good Governance. Trondheim, Norway: Concept Program, The Norwegian University of Science and Technology.

PMI. (2009). A Guide to the Project Management Body of Knowledge. Project Management Institute.

Savage, S. L. (2012). The Flaw of Averages: Why We Underestimate Risk in the Face of Uncertainty. Wiley.

Taylor, F. W. (2006). The Principles of Scientific Management. Cosimo Classics.

Zuill, W. (2013, May 17). The NoEstimates Hashtag. 

About the Author

Dimitar Bakardzhiev is the Managing Director of Taller Technologies Bulgaria and an expert in driving successful and cost-effective technology development. As a LKU Accredited Kanban Trainer (AKT) Dimitar puts lean principles to work every day when managing complex software projects. Dimitar has been one of the evangelists of Kanban in Bulgaria and has published David Anderson’s Kanban book as well as books by Goldratt and Deming in the local language.

 
