Project Metrics for Software Development

Since 2007, I have been involved in an effort to measure the success of software development projects, regardless of their methodology, so that we can report to upper management. This article presents some of the conclusions I personally reached during that work, describing for a broader audience the challenges we faced and how we addressed them. It focuses on performance metrics rather than progress metrics, because I believe progress metrics look only at the present and have little impact on the team's future accomplishments. I see progress metrics as a way to help a team achieve a goal on time; however, unless the team reflects on its performance, its chances to improve are reduced. For example, if a project manager keeps showing something like "Progress Against Schedule", the team will rush to recover lost time without stopping to think about what went wrong and how to improve, since they won't have the time. That is why I believe progress metrics are helpful but not complete.

You might all remember the famous quote: "If you can't measure it, you can't manage it." If a company is unable to manage a software project, how will they know how to improve, when they have improved, and what real value was added by a change introduced into the process, for example a transformation of their software practices and principles (did someone mention "a switch from Waterfall to Agile"?). Software project success has always been the goal of the industry; however, the metrics that help us measure that success have been as diverse as they could be. Depending on the particular methodology you follow, the suggested metrics may have nothing in common. We faced that challenge at Hewlett-Packard: we had a diverse set of projects using different methodologies, so our upper management received mixed metrics depending on what the different organizations wanted to report.

For the Agile readers: we know their projects are uniquely well suited for metrics, as data collection can be done seamlessly and without distracting the team. However, the metrics usually suggested may not be suitable for projects that do not follow the principles and practices embraced by Agile; concepts such as velocity, cumulative flow and burndown may not make sense for teams that have not adopted them. What if we want to measure projects for what they are, projects, and not for the methodology they use?

Managing the software development process requires the ability to identify measures that characterize the underlying parameters, so they can be controlled and can aid continuous improvement of the project. After several attempts and research inside and outside the company, we ended up struggling with a large set of proposed metrics (around twenty). In a very Agile way, we sat with a team of experts to retrospect on each metric. First we eliminated all the proposed metrics that dealt with "size" (of the project, artifacts, code or deliverables), since big companies manage all kinds of software projects, and we asked ourselves: do we really want to measure this? The answer was no, because we first wanted a small set of simple metrics that placed no burden on the team and that were more useful in determining team maturity and performance.

Finally we decided to keep it simple and defined three core metrics for all IT as follows:

Effort was defined as the total amount of time for a task that results in a work product or a service. The planned amount of time is called the 'Planned Effort' and the actual amount of time spent is called the 'Actual Effort'; it can be measured in hours or in days depending on the specifics of the project. A task is part of a set of actions that accomplishes a job, problem or assignment, and this definition should not differ between Waterfall and Agile projects. However, we do acknowledge differences in how tasks are managed under different methodologies, which can be seen as advantages or disadvantages of each. The following figure shows how this metric is summarized to the team.

Figure 1: Cumulative Person Effort Visual Display & Actual Effort distribution per phase

Effort is a directly observable quantity which is measured cumulatively for the tasks assigned to specific resources; it can also be computed for specific tasks, milestones or phases. The 'Planned Effort' is collected when the work for a specific task is planned and is refined when the task is analyzed or designed. The 'Actual Effort' is collected during the construction, testing and warranty work related to the specific task. At the beginning, organizations should expect big differences or gaps between their estimates and actuals; as teams mature, the numbers tend to get closer. Our experience has been that if after a six-month period the effort charts do not show converging trends, the team must retrospect and find root causes, which can be related to factors inside or outside the team. Defining the constraint, as suggested by the Poppendiecks [5], could be a great place to start.
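As a minimal sketch of how such data might be rolled up (the phase names, task records and hours below are invented for illustration), planned versus actual effort can be aggregated per phase to spot the estimation gaps mentioned above:

```python
from collections import defaultdict

# Hypothetical task records: (phase, planned_hours, actual_hours).
tasks = [
    ("analysis", 16, 20),
    ("coding",   40, 52),
    ("coding",   24, 22),
    ("testing",  16, 18),
]

planned = defaultdict(float)
actual = defaultdict(float)
for phase, plan_h, act_h in tasks:
    planned[phase] += plan_h   # 'Planned Effort', captured when the task is planned
    actual[phase] += act_h     # 'Actual Effort', captured as the work is done

for phase in planned:
    gap = actual[phase] - planned[phase]
    print(f"{phase}: planned {planned[phase]:.0f}h, actual {actual[phase]:.0f}h, gap {gap:+.0f}h")
```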

Productivity was defined as how many "simple tasks" can be delivered per day. It can be computed at various levels within a project: individual, profile, task, phase or project.

The definition "simple task" will always raise heated conversations; we finally define it as the amount of time required by a resource to deliver something and settle it to five hours (of course some simple tasks take less or more, but we settled for that number).

Given our definition, "simple tasks that can be delivered per day (8 hours)", the formula was defined as:

Productivity = ((Planned Effort / 5) / Actual Effort) *8
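To make the arithmetic concrete, here is a small hypothetical worked example of the formula; the 20-hour and 25-hour figures are invented, 5 hours is the agreed "simple task" size and 8 hours is the working day:

```python
SIMPLE_TASK_HOURS = 5   # agreed size of a "simple task"
HOURS_PER_DAY = 8       # length of a working day

def productivity(planned_effort_hours: float, actual_effort_hours: float) -> float:
    """Simple tasks delivered per day: ((Planned Effort / 5) / Actual Effort) * 8."""
    simple_tasks = planned_effort_hours / SIMPLE_TASK_HOURS
    return simple_tasks / actual_effort_hours * HOURS_PER_DAY

# A task planned at 20 hours (4 simple tasks) that actually took 25 hours:
print(productivity(20, 25))   # (4 / 25) * 8 = 1.28 simple tasks per day
```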

The following figure shows how this metric is summarized to the team.

Figure 2: Productivity Visual Display

It's obvious that the metric can be challenged, and a lot of valid arguments were raised while we defined it; however, we needed to settle on something the majority would agree with, just like in any nice democracy. Once we had such a metric, we discovered its power to compare project health by week, month, resource, and so on.

The productivity metric begins at a task level and rolls up to a deliverable, phase and project level. It has an obvious dependency on the estimation technique and effort collecting tool.
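The article does not spell out the mechanics of the roll-up; a reasonable sketch (and an assumption on my part) is to sum planned and actual effort at each level before applying the same formula:

```python
SIMPLE_TASK_HOURS = 5
HOURS_PER_DAY = 8

# Hypothetical task records tagged with their deliverable and phase.
tasks = [
    {"deliverable": "login",  "phase": "coding",  "planned": 20, "actual": 25},
    {"deliverable": "login",  "phase": "testing", "planned": 10, "actual": 8},
    {"deliverable": "report", "phase": "coding",  "planned": 15, "actual": 18},
]

def rolled_up_productivity(records):
    # Sum effort across the selected records, then apply the productivity formula.
    planned = sum(r["planned"] for r in records)
    actual = sum(r["actual"] for r in records)
    return (planned / SIMPLE_TASK_HOURS) / actual * HOURS_PER_DAY

print("project:", rolled_up_productivity(tasks))
print("coding phase:", rolled_up_productivity([r for r in tasks if r["phase"] == "coding"]))
print("login deliverable:", rolled_up_productivity([r for r in tasks if r["deliverable"] == "login"]))
```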

Figure 3: Cumulative Productivity Visual Display (Waterfall vs. Agile releases)

The figure shows the cumulative productivity for two similar project releases worked on by the same team, with the same members, using the same metrics. The team first spent six months using their familiar Waterfall-style methodology (all analysis up front, all coding in between, all testing at the end) and then six more months on a release while adopting Agile. The productivity formula applied to both releases shows the trend Agilists often describe: batches of work being moved from one phase (for example coding) to another (for example testing) do not allow the team to deliver a quality product at a sustainable pace, so productivity as defined by the formula decreases (perhaps due to the thrashing of switching from one task to another in an attempt to multitask). The same team, using Agile principles and practices such as iterative development, focusing on a given feature, and failing early (choosing risky requirements first), was able to increase their productivity compared with the previous release.

With a common metric we have been able to measure the delivery predictability of Agile projects, as well as the accuracy of their estimates as the project matures.

Quality was defined as the number of severe, medium and low defects delivered throughout the lifetime of the project. It helps to identify the fitness of the deliverable for the end user. Each team needs to define what severe, medium and low mean for their specific project. Quality should be reported throughout the life of the project; the later defects are caught, the more impact they have on the project. The following figure shows how this metric is summarized to the team.

Figure 4: Quality Visual Display
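As a minimal sketch of how such a tally might be kept (the severity labels and defect records below are illustrative; each team defines its own severity criteria):

```python
from collections import Counter

# Illustrative defect log: (severity, phase in which the defect was found).
defects = [
    ("severe", "testing"),
    ("medium", "coding"),
    ("medium", "testing"),
    ("low",    "warranty"),
]

print(Counter(severity for severity, _ in defects))
# Counter({'medium': 2, 'severe': 1, 'low': 1})
```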

With defects collected, we also track a derived metric, "defect removal". Its goal is to evaluate the percentage of issues found late in the process (they obviously cost more) as opposed to those found early. Here we have also found some interesting behavior when comparing Agile to Waterfall-style projects.
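A sketch of how this derived metric could be computed; the split of phases into "early" and "late" is an assumption, since the article leaves that cut-off to each team:

```python
# Illustrative defect log: (severity, phase in which the defect was found).
defects = [
    ("severe", "testing"),
    ("medium", "coding"),
    ("medium", "testing"),
    ("low",    "warranty"),
]

# Assumed cut-off between "early" and "late" phases.
EARLY_PHASES = {"analysis", "design", "coding"}

late = sum(1 for _, phase in defects if phase not in EARLY_PHASES)
print(f"defects found late: {100 * late / len(defects):.0f}%")   # 75%
```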

Figure 5: Defect Removal Visual Display

The previous display shows how an Agile project has a more sustainable defect rate throughout the lifecycle of the software, whereas Waterfall-style projects show a peak towards the end. The information was collected from two different product releases created by the same team (with a comparable set of requirements) but using the two approaches.

This set of metrics is constantly collected, analyzed and reported. We can enclose them in a "project dashboard" so as to have a holistic view and a way to share and interpret the results with the stakeholders.
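As a minimal, hypothetical sketch of what such a dashboard snapshot could contain for one reporting period (all figures invented):

```python
# Hypothetical snapshot of the three core metrics for one reporting period.
dashboard = {
    "period": "week 14",
    "effort": {"planned_hours": 320, "actual_hours": 355},
    "productivity": 1.44,                        # simple tasks delivered per day
    "quality": {"severe": 1, "medium": 4, "low": 9},
    "defects_found_late_pct": 22.0,              # derived "defect removal" view
}

for name, value in dashboard.items():
    print(f"{name}: {value}")
```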

There's an obvious correlation among the metrics, as the following comparisons show:

Productivity vs. Delivered Defect Density

  • Positive trend: High productivity with low delivered defect density is good behavior; the team can aim for further improvement in productivity while monitoring the delivered defect density level.
  • Negative trend: High productivity with high delivered defect density indicates that the planned effort for QA activities is insufficient in the project. While high productivity is desirable, there has to be a balance between productivity and quality for overall benefit; hence, an optimal level of productivity with good quality is desirable. High delivered defect density with low productivity indicates a need to fine tune, tailor or automate testing to detect issues earlier in the game.

Productivity vs. Defect Removal Efficiency

  • Positive trend: High productivity with high defect removal efficiency indicates a good balance between productivity and quality. The team can aim for further improvement in productivity while monitoring defect removal efficiency.
  • Negative trend: High productivity with low defect removal efficiency indicates that the planned effort for QA activities is insufficient in the project. High defect removal efficiency with low productivity indicates a need to pay more attention to quality throughout the cycle.
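As an illustration only, the Productivity vs. Delivered Defect Density guidance above could be encoded as simple checks; the high/low classification and wording are paraphrased, and the thresholds for "high" would be a team decision:

```python
def interpret(productivity_high: bool, defect_density_high: bool) -> str:
    """Paraphrase of the Productivity vs. Delivered Defect Density guidance."""
    if productivity_high and defect_density_high:
        return "Planned QA effort is likely insufficient; balance productivity and quality."
    if productivity_high and not defect_density_high:
        return "Good behavior; improve productivity further while monitoring defect density."
    if defect_density_high:
        return "Fine tune, tailor or automate testing to detect issues earlier."
    return "Combination not covered by the guidance above."

print(interpret(productivity_high=True, defect_density_high=True))
```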

Guilherme Souto, one of our Project Managers in Brazil R&D, documented the following quick tips to help us stay honest and rationally driven during metrics adoption:

  • Metrics need to be linked to an organizational or project objective to demonstrate whether we are achieving what we committed to.
  • The metric needs to be integrated into the day-to-day workflow (as much as possible) in order to avoid allocating time just to collect data.
  • There needs to be verification that the chosen metrics cover the different areas that determine whether a project reaches the end with or without success.
  • Metrics are like words: they only truly make sense as a set that forms a sentence, so as much as possible stay away from over-analyzing the data provided by a single metric. When this set of metrics was introduced we kept seeing teams focus too much on productivity; isolating that metric leaves out other variables, such as how difficult or challenging the project was, or the level of quality the team produced (because of their maturity, the process followed or other factors). At the end of the day we didn't want to repeat the mistakes we make when we look at progress metrics, such as performance against schedule, in isolation, driving us to sacrifice quality or resources.
  • It is useless to adopt a metric that cannot serve as a basis for defining action plans.
  • Target values need to be defined for metrics like performance, since in order to take corrective actions the team must be aware of the expected minimum and upper limit value ranges; a minimal sketch of such a check follows this list.
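A minimal sketch of such a limit check, with purely hypothetical target ranges; real limits would be agreed per team and project:

```python
# Hypothetical target ranges; real limits would be set by the team.
TARGETS = {
    "productivity": (1.0, 1.6),             # simple tasks delivered per day
    "defects_found_late_pct": (0.0, 25.0),  # percent of defects found late
}

def check(metric: str, value: float) -> str:
    low, high = TARGETS[metric]
    if value < low:
        return f"{metric} = {value}: below the expected minimum of {low}, corrective action needed"
    if value > high:
        return f"{metric} = {value}: above the upper limit of {high}, investigate"
    return f"{metric} = {value}: within the expected range"

print(check("productivity", 0.8))
```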

The metrics provide only part of the story, and their applicability to decision making depends on knowing the chain of events that triggered a specific change in a team's trend. The approach and set of metrics defined here might not be the best for your particular team, but it is what we have done so far and it is working for us. Hopefully it will work for some of the readers, but of course I could be way off.

References

The following are good starting points on the topics presented in this article.

1. H. T. Goranson. The Agile Virtual Enterprise: Cases, Metrics, Tools. Quorum Books, 2003.

2. Tom DeMarco. Controlling Software Projects: Management, Measurement and Estimation. Prentice Hall, 1986.

3. Cem Kaner. Software Engineering Metrics: What Do They Measure and How Do We Know? 10th International Software Metrics Symposium, 2004.

4. Rüdiger Lincke, Jonas Lundberg, Welf Löwe. Comparing Software Metrics Tools. International Symposium on Software Testing and Analysis, 2008.

5. Mary and Tom Poppendieck. Implementing Lean Software Development: From Concept to Cash. Addison-Wesley Professional, 2006.
