
Project Metrics for Software Development

Posted by Carlos Sirias on Jul 14, 2009

Since 2007, I have been involved in an effort to measure the success of software development projects, regardless of their methodology, so that we can report to upper management. This article presents some of the conclusions I reached during that research, in an effort to share with a broader audience the challenges we had and how we addressed them. It focuses on performance metrics rather than progress metrics, as I personally believe the latter focus only on the present and have little impact on the team's future accomplishments. I see progress metrics as a way to help your team achieve a goal on time; however, unless the team reflects on its performance, its chances of improving are reduced. For example, if a project manager keeps showing something like "Progress Against Schedule", the team will rush to recover lost time without stopping to think about what went wrong and how to improve, since they won't have time. That's why I believe progress metrics are helpful but not complete.

You may remember the famous quote: "If you can't measure it, you can't manage it". If a company is unable to manage a software project, how will it know how to improve? How will it know when it has improved? And what is the real value added by a change introduced into the process, for example a transformation of its software practices and principles (did someone mention "a switch from Waterfall to Agile")? Software project success has always been the goal of the industry; however, the metrics that help us measure that success have been as diverse as they could be. Depending on the methodology you follow, the suggested sets of metrics may have nothing in common. We faced that challenge at Hewlett Packard, as we had a diverse set of projects using different methodologies, so our upper management received mixed metrics depending on what the different organizations wanted to report.

For Agile readers: we know Agile projects are uniquely well suited for metrics, as the data collection can be done seamlessly and without distracting the team. However, the suggested set of metrics might not be suitable for projects that do not use the principles and practices embraced by Agile; things such as velocity, cumulative flow and burndown might not make sense for teams that have not embraced Agile. What if we want to measure projects for what they are, projects, and not for what they use?

Management of the software development process requires the ability to identify measures that characterize the underlying parameters, in order to control and aid continuous improvement of the project. After several attempts and research inside and outside the company, we ended up struggling with a huge set of proposed metrics (around twenty). In a very Agile way, we sat with a team of experts to retrospect on each particular metric. First we eliminated all the proposed ones that dealt with "size" (of the project, artifacts, code, deliverables), as big companies manage all kinds of software projects, and we asked ourselves: do we really want to measure this? The answer was no, as we first wanted a set of easy metrics that did not involve any burden on the team and that were more useful in determining team maturity and performance.

Finally we decided to keep it simple and defined three core metrics for all IT as follows:

Effort was defined as the total amount of time for a task that results in a work product or a service. The planned amount of time is called 'Planned Effort' and the actual amount of time spent is called 'Actual Effort'; it can be measured in hours or in days depending on the specifics of the project. A task is part of a set of actions that accomplishes a job, problem or assignment, and this should not be different for Waterfall or Agile projects. However, we do acknowledge differences in how tasks are managed under different methodologies, which can be seen as advantages or disadvantages of each. The following figure shows how this metric is summarized to the team.

Figure 1: Cumulative Person Effort Visual Display & Actual Effort distribution per phase

Effort is a directly observable quantity that is measured cumulatively for the tasks assigned to specific resources; it can also be computed for specific tasks, milestones or phases. The 'Planned Effort' is collected when the work for a specific task is planned and is refined when the task is analyzed or designed. The 'Actual Effort' is collected during the construction, testing and warranty related to the specific task. At the beginning, organizations should expect to see big differences or gaps between their estimates and actuals; however, as teams "mature" the numbers tend to get closer. Our experience has been that if, after a six-month period, the effort charts don't show closer trends, the team must retrospect and find root causes, which can be related to factors inside or outside the team. Defining the constraint could be a great place to start, as suggested by Poppendieck [5].
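As a rough illustration of how the gap between planned and actual effort could be tracked cumulatively, here is a minimal Python sketch. The function name, the monthly figures and the choice to express the gap as a fraction of cumulative planned effort are all assumptions made for the example; the article does not prescribe a specific calculation.

```python
from itertools import accumulate


def cumulative_effort_gap(planned_hours, actual_hours):
    """Return one (cumulative planned, cumulative actual, gap) tuple per period.

    The gap is (actual - planned) / planned on the cumulative totals.
    This way of expressing the gap is an assumption of this sketch.
    """
    if len(planned_hours) != len(actual_hours):
        raise ValueError("planned and actual series must have the same length")
    rows = []
    for planned, actual in zip(accumulate(planned_hours), accumulate(actual_hours)):
        gap = (actual - planned) / planned if planned else None
        rows.append((planned, actual, gap))
    return rows


# Hypothetical monthly figures, in hours.
planned = [100, 120, 110]
actual = [150, 140, 115]
rows = cumulative_effort_gap(planned, actual)

for month, (p, a, gap) in enumerate(rows, start=1):
    print(f"month {month}: planned={p}h actual={a}h gap={gap:.0%}")
```

With these made-up numbers the gap shrinks month over month, which is the kind of trend the paragraph above describes for a maturing team.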

Productivity was defined as how many "simple tasks" can be delivered per day. It can be computed at various levels within a project: individual, profile, task, phase or project.

The definition of a "simple task" will always raise heated conversations; we finally defined it as the amount of time required by a resource to deliver something, and settled on five hours (of course some simple tasks take less or more, but we settled on that number).

Given our definition, "simple tasks that can be delivered per day (8 hours)", the formula was defined as:

Productivity = ((Planned Effort / 5) / Actual Effort) *8
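The formula can be written out as a small Python function. This is only a sketch: the function and constant names are made up for the example, and both efforts are assumed to be expressed in hours.

```python
SIMPLE_TASK_HOURS = 5  # size of a "simple task" agreed in the article
WORKDAY_HOURS = 8      # length of a working day used in the article


def productivity(planned_effort_hours, actual_effort_hours):
    """Productivity = ((Planned Effort / 5) / Actual Effort) * 8."""
    if actual_effort_hours <= 0:
        raise ValueError("actual effort must be positive")
    simple_tasks = planned_effort_hours / SIMPLE_TASK_HOURS
    return simple_tasks / actual_effort_hours * WORKDAY_HOURS


# When planned and actual effort are equal, the result is 8 / 5 = 1.6.
on_plan = productivity(10, 10)
# Taking twice as long as planned halves the value.
slower = productivity(10, 20)
# Finishing in half the planned time doubles it.
faster = productivity(20, 10)
```

Note that the result depends only on the ratio of planned to actual effort, so it is sensitive to how the estimate was produced.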

The following figure shows how this metric is summarized to the team.

Figure 2: Productivity Visual Display

It's obvious that the metric can be challenged, and a lot of valid arguments were raised during our definition; however, we needed to settle on something that the majority would agree on, just like in every nice democracy. Once we had such a metric, we found its power in comparing project health by week, month, resource, etc.

The productivity metric begins at a task level and rolls up to a deliverable, phase and project level. It has an obvious dependency on the estimation technique and effort collecting tool.
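One possible way to roll the metric up is to sum planned and actual effort over all tasks in a group and apply the formula to the totals. The article does not say how the roll-up is computed, so this aggregation rule, the task records and the field names below are assumptions for illustration only.

```python
from collections import defaultdict


def productivity(planned_hours, actual_hours):
    """Productivity = ((Planned Effort / 5) / Actual Effort) * 8."""
    return (planned_hours / 5) / actual_hours * 8


def rollup_productivity(tasks, level):
    """Group tasks by the given field and compute productivity on the summed effort.

    Summing effort before applying the formula is an assumption of this sketch.
    """
    totals = defaultdict(lambda: [0.0, 0.0])  # key -> [planned, actual]
    for task in tasks:
        totals[task[level]][0] += task["planned"]
        totals[task[level]][1] += task["actual"]
    return {key: productivity(p, a) for key, (p, a) in totals.items()}


# Hypothetical task records; efforts are in hours.
tasks = [
    {"project": "release-1", "phase": "coding", "planned": 10, "actual": 20},
    {"project": "release-1", "phase": "coding", "planned": 15, "actual": 30},
    {"project": "release-1", "phase": "testing", "planned": 5, "actual": 5},
]

by_phase = rollup_productivity(tasks, "phase")
by_project = rollup_productivity(tasks, "project")
```

Grouping by a different field (for example a resource name) would give the per-individual view with the same function.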

Figure 3: Productivity Visual Display

The figure shows the cumulative productivity for two similar project releases worked on by the same team, with the same members, using the same metrics. First the team worked six months using their familiar Waterfall type of methodology (all analysis upfront, all coding in between, all testing at the end), and then they spent six more months on a release while adopting Agile. The productivity formula applied to both project releases shows the typical trend Agilists usually describe: batches of work being moved from one phase (for example coding) to another phase (for example testing), which does not allow the team to deliver a quality product at a sustainable pace and therefore decreases their productivity as defined by the previous formula (perhaps due to the thrashing of switching from one task to another, an attempt at multitasking). The same team, using Agile principles and practices such as iterative development, focus on a given feature and fail-early development (choosing risky requirements first), was able to increase their productivity over their previous project release.

With a common metric we have been able to measure the predictability in terms of delivery that Agile projects have, as well as their accuracy of estimations as the project matures.

Quality was defined as the number of severe, medium or low defects delivered through the lifetime of the project. It helps identify the goodness of the deliverable to the end user. Each team needs to define what severe, medium and low mean for their specific project. Quality should be reported throughout the life of the project; the later defects are caught, the more impact they will have on the project. The following figure shows how this metric is summarized to the team.

Figure 4: Quality Visual Display

With defects collected, we also track a derived metric, "defect removal". Its goal is to evaluate the percentage of issues that are found late in the process (they obviously cost more) as opposed to the ones found early. Here we have also found some interesting behavior when comparing Agile to Waterfall type projects.
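The percentage of late-found issues could be computed along these lines. The phase names, the defect counts and the decision about which phases count as "late" are assumptions for this sketch; each team would need to define them for its own process.

```python
def late_defect_percentage(defects_by_phase, late_phases):
    """Percentage of all recorded defects that were found in a 'late' phase."""
    total = sum(defects_by_phase.values())
    if total == 0:
        return 0.0
    late = sum(
        count for phase, count in defects_by_phase.items() if phase in late_phases
    )
    return 100.0 * late / total


# Hypothetical counts of defects found in each phase of one release.
found = {"construction": 30, "testing": 15, "warranty": 5}

# Assumption: only defects found during warranty are treated as "late".
late_pct = late_defect_percentage(found, late_phases={"warranty"})
print(f"{late_pct:.1f}% of defects were found late")
```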

Figure 5: Defect Removal Visual Display

The previous display shows how an Agile project has a more sustainable defect rate throughout the lifecycle of the software whereas Waterfall type of projects show a peak towards the end. The information was collected from two different product releases created by the same team (with a comparable set of requirements) but using the two approaches.

This set of metrics is constantly collected, analyzed and reported. We can enclose them in a "project dashboard" so as to have a holistic view and a way to share and interpret the results with the stakeholders.

There's an obvious correlation among the metrics, as shown by the following table:

Metric 1: Productivity
Metric 2: Delivered Defect Density
Positive Trend: A high productivity with high delivered defect density indicates that the planned effort for QA activities is insufficient in the project. While high productivity is desirable, there has to be a balance between productivity and quality for overall benefit. Hence, an optimal level of productivity with good quality is desirable.
Negative Trend: A high productivity with low delivered defect density is good behavior, and the team could aim for further improvement in productivity while monitoring the delivered defect density level. A high delivered defect density with low productivity indicates a need to fine tune, tailor or automate testing to detect issues early in the game.

Metric 1: Productivity
Metric 2: Defect Removal Efficiency
Positive Trend: A high productivity with high defect removal efficiency indicates a good balance between productivity and quality. The team can aim for further improvement in productivity while monitoring the defect removal efficiency.
Negative Trend: A high productivity with lower defect removal efficiency indicates that the planned effort for QA activities is insufficient in the project. A high defect removal efficiency with low productivity indicates a need to pay more attention to quality throughout the cycle.
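The readings in the table can be captured as a simple lookup, sketched below in Python. The function name and the boolean "high/low" flags are assumptions of the example; deciding what counts as "high" for each metric is left to the team, and combinations the table does not mention are reported as such rather than guessed.

```python
# Keys: (second metric, productivity is high, second metric is high).
READINGS = {
    ("delivered defect density", True, True):
        "planned effort for QA activities looks insufficient",
    ("delivered defect density", True, False):
        "good behavior; aim for more productivity while monitoring defect density",
    ("delivered defect density", False, True):
        "fine tune, tailor or automate testing to detect issues early",
    ("defect removal efficiency", True, True):
        "good balance between productivity and quality",
    ("defect removal efficiency", True, False):
        "planned effort for QA activities looks insufficient",
    ("defect removal efficiency", False, True):
        "pay more attention to quality throughout the cycle",
}


def interpret(metric, productivity_high, metric_high):
    """Return the table's reading for a combination, if the table covers it."""
    key = (metric, productivity_high, metric_high)
    return READINGS.get(key, "combination not covered by the table")


reading = interpret("delivered defect density", True, False)
uncovered = interpret("defect removal efficiency", False, False)
```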

Guilherme Souto, one of our Project Managers in Brazil R&D, documented the following quick tips that will help us to be honest and rationally driven during metrics adoption:

  • Metrics need to be linked to an organizational or project objective to demonstrate if we are achieving what we committed.
  • The metric needs to be integrated in the day by day workflow (as much as possible) in order to avoid having time allocated just to collect data.
  • There needs to be a verification that the chosen metrics are covering different areas that will define if a project reaches the end with or without success.
  • Metrics are like words: they only truly make sense in a set that creates a sentence, so as much as possible it is necessary to stay away from over-analyzing the data provided by one metric. When this set of metrics was introduced, we kept seeing teams focusing too much on productivity; however, looking at it in isolation puts us in a bad position to take into account other variables, such as how difficult or challenging the project was, or the level of quality that the team produced (because of their maturity, the process followed or other factors). At the end of the day, we didn't want to make the same mistakes we make when we look at progress metrics such as performance against schedule in isolation, driving us to sacrifice quality or resources.
  • It is useless to use a metric that cannot be used to base or define action plans.
  • Target values need to be defined for metrics like performance, since in order to take corrective actions the team must be aware of the expected minimum and upper limit value ranges.

The metrics provide only part of the story, and their applicability to decision making depends on knowing the chain of events that triggers a specific change in a team trend. The approach and defined set of metrics might not be the best for your particular team, but it is what we have done so far and it is working for us. Hopefully it will work for some of the readers, but of course I could be way off.

References

The following are good starting points on the topics presented in this article.

1. H. T. Goranson. The Agile Virtual Enterprise: Cases, Metrics, Tools. Quorum Books. 2003.

2. Tom DeMarco. Controlling Software Projects: Management, Measurement and Estimation. Prentice Hall. 1986.

3. Cem Kaner. Software Engineering Metrics: What do they measure and how do we know? 10th International Software Metrics Symposium. 2004.

4. Rüdiger Lincke, Jonas Lundberg, Welf Löwe. Comparing software metrics tools. International Symposium on Software Testing and Analysis. 2008.

5. Poppendieck and Poppendieck. Implementing Lean Software Development: From Concept to Cash. Addison Wesley Professional. 2003.


Productivity by Piers Thompson

If I understand correctly then your productivity measure is actually a measure of estimation accuracy, and therefore the reasoning relating "productivity" and "discovered defect density" is bogus.

Rewriting your formula for "productivity" I get:

Productivity = (planned/actual)*k where k=8/5

Therefore if I plan that a particular task will take 1 hour, and it takes 1 hour then "Productivity" = 8/5 = 1.6

"productivity" > 1.6 means "faster than planned"
"productivity" < 1.6 means "slower than planned"

Incidentally, the use of defect counts to compare between dissimilar projects has precisely one thing going for it: it's easy to do. It doesn't tell you much but at least you don't waste much money collecting the metric.

P.

Re: Productivity by Carlos Sirias

Thanks for your reply Piers... We actually thought our definition was bogus... Yeah!!!! we did... however, trying to find another definition proved to be really painful and time consuming; after all, how can you measure productivity?.... I know some methodologies have better definitions and I agree; however, let's keep in perspective that I'm trying to have a metric that fits most of them. So that was our reasoning behind the formula: not perfect... not even close, but at least something easy to understand and enough to get your team going.

As you pointed out, one of our premises was to have the simplest set of metrics that could possibly work... I hope we have shown that. Thanks for your comments.

performance is not the only aspect to be measured by Mark Kofman


In your article you put too much attention on performance. Why performance? Why not measure the expertise of the team? Why not spend time evaluating scope and product size? In my opinion, when measuring a software project it is important to collect metrics that inform you about different aspects of the project. Only that would give you the whole picture of how well things are going.


Mark

ceo, programeter.com

Re: Productivity by Hal Arnold

So.. "it's really painful and time consuming, [and] how do you measure productivity..". But this is exactly why so many others have tried and failed. But it's really dangerous, because your audience and stakeholders won't understand how "bogus" it really is. They'll be happily expecting that you're REALLY improving productivity, when other more important things like technical debt are not being concentrated on by the team, because they are coding to your productivity metric.
It seems to me that you've measured how well the team is able to estimate; which is wonderful, but not what you want

Re: performance is not the only aspect to be measured by Carlos Sirias

Thanks for your reply, Mark; indeed we pay too much attention to "performance". Keep in perspective that I work for a shop of roughly 5000 technologists, so we are still maturing. Measuring scope and product size is part of what we do when estimating; we have found it very difficult to measure how well teams do this... the best we could do was our definition of "performance".

Re: Productivity by Carlos Sirias

Great comment Hal; I totally agree with you. Keep in perspective that I work for a shop of roughly 5000 technologists, so we are still maturing. Measuring technical debt in this kind of environment is really difficult; I would be interested in reading a proposal on that... I guess one thing we could do is have a really senior technologist audit this, but you can get the idea that this would be a full-time job for not one person but probably a whole organization of dozens (to cover 5000). First we need to sell this to upper management, and it will take a while for the industry trend to reach us.

Measurements by Robert Fisher

I like this article, and it deserves prominence.

The definitions of measurements are important (reference the misunderstanding about "Productivity" and "Estimating Accuracy" above).

About 80% of our software development shop is Agile with long experience, and we need common measurements across Agile and traditional projects. We found it boils down to just three measurements:
1) Size (key for any endeavour - c.f. building a bridge "how big is it?")
2) Effort (the cost of work - estimated and actual)
3) Waste (in products and processes)
This is only part of the story - they all come with attributes, e.g. business benefit, activity, defect source? But we keep them few and simple.

Useful measurements are ratios, e.g. resource ability, productivity, accuracy, defect density, escape rates, progress, etc. All derived from the three basics plus their attributes.

As the article states, we found measurements must be tied to key business objectives, e.g. how good are our products?, how good are we?, where are we with the project?, etc. Unsurprisingly the three measurements are the same controls used by our professional engineers each day.

Most of all they are automatic - they are collected and analysed by the development tools at no cost to our professionals. Using spreadsheets invites failure. To do this, the sources, form, translation, charting, and statistical analysis are strongly defined and common in the tools we use. This is where all the hard work came in.

Re: Measurements by Carlos Sirias

Hi Robert

Thanks for your input. I concur with you that every shop needs to achieve some level of automation in collecting the data that feeds the metrics... that's key so that we don't hurt the chemistry of our technical teams. Great feedback.

InfoQ.com and all content copyright © 2006-2014 C4Media Inc. InfoQ.com hosted at Contegix, the best ISP we've ever worked with.