InfoQ

Cyclomatic Complexity Revisited

Posted by Gavin Terrill on Mar 31, 2008 09:56 PM

Cyclomatic Complexity is a software metric used to measure the complexity of a given piece of code. It does this by counting the number of linearly independent execution paths through the code. For example, a block of code with no branching statements has a complexity of 1. If you add a single if test, there are 2 paths: one where the condition is true and one where it is false.
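
As a rough illustration of the counting rule described above, the following Python sketch approximates CC as 1 plus the number of decision points found in a piece of source code. The function name `cyclomatic_complexity` and the set of node types treated as decision points are assumptions made for this example; real tools differ in exactly what they count.

```python
import ast

# Node types treated as decision points in this sketch (an assumption;
# real metrics tools differ on what they count).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source: str) -> int:
    """Approximate CC as 1 + the number of decision points in the source."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds one extra branch per operator.
            complexity += len(node.values) - 1
    return complexity


straight_line = "x = 1\ny = x + 1\n"
one_branch = "x = 1\nif x > 0:\n    y = 1\nelse:\n    y = 2\n"

print(cyclomatic_complexity(straight_line))  # 1
print(cyclomatic_complexity(one_branch))     # 2
```

The straight-line snippet has no branches and scores 1; adding the single if test raises the score to 2, matching the example in the text.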

Software developers strive to build and maintain code with low complexity as it helps readability, and in theory reduces defect count. However, that theory has recently been challenged with the release of findings by the metrics dashboard provider Enerjy. Enerjy analyzed tens of thousands of source files, correlating defects against the Cyclomatic Complexity (CC):

The results show that the files having a CC value of 11 had the lowest probability of being fault-prone (28%). Files with a CC value of 38 had a probability of 50% of being fault-prone. Files containing CC values of 74 and up were determined to have a 98% plus probability of being fault-prone.

Andrew Binstock from SDTimes provided an interesting observation:

What Enerjy found was that routines with CCNs of 1 through 25 did not follow the expected result that greater CCN correlates to greater probability of defects. Rather, it found that for CCNs of 1 through 11, the higher the CCN the lower the bug probability. It was not until CCN reached 25 that defect probability rose sufficiently to be equal that of routines with a CCN of 1.

The Enerjy results have created some confusion around how CC is counted. Keith Braithwaite pointed out that the Enerjy study counted CC at the file level, not the method level. Christopher Beck, commenting on The Quality of TDD, chimed in saying:

... it’s not CC (and shouldn’t be named CC). Rather it comes close to another metric called WMC or “Weighted Methods per Class”, which sums up CC values of a class.
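
To make the file-level versus method-level distinction concrete, here is a small Python sketch that computes an approximate CC for each function and then sums the values, in the spirit of WMC. The helper names (`function_cc`, `per_function_cc`) and the sample code are hypothetical, and the sketch is not a reconstruction of how Enerjy computed its figures.

```python
import ast

# Simplified set of decision points (an assumption for this sketch).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def function_cc(func: ast.AST) -> int:
    """1 + decision points inside one function (a simplified count)."""
    return 1 + sum(isinstance(n, DECISION_NODES) for n in ast.walk(func))


def per_function_cc(source: str) -> dict[str, int]:
    """Map each function name in the source to its approximate CC."""
    tree = ast.parse(source)
    return {
        node.name: function_cc(node)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


SAMPLE = '''
def is_positive(x):
    if x > 0:
        return True
    return False

def first_even(items):
    for item in items:
        if item % 2 == 0:
            return item
    return None

def identity(x):
    return x
'''

by_function = per_function_cc(SAMPLE)
print(max(by_function.values()))  # 3: highest method-level CC
print(sum(by_function.values()))  # 6: summed, file-level style figure
```

No single function here exceeds a CC of 3, yet the summed figure is 6. A file with many simple methods can therefore report a high file-level value, which is why the two kinds of number should not be compared against the same thresholds.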

Regardless of the purity of the Enerjy approach to computing CC, one thing is clear: if a file's CC is 74 or higher, there is a very good chance it is buggy.

  1. Cyclomatic Complexity Revisited

    Apr 1, 2008 12:03 PM by Rich Sharpe

    These results have produced some response from the industry that we have found very interesting. In the original posting we did state that the results were for the Cyclomatic Complexity values per file and are not weighted for any other factors.

    The reason we released these findings in this way was because we had performed this work on over 200 different metrics and to analyze this at method level would have taken a lot more time (and was not required for our purposes at the time).

    Maybe we could have used ‘WMC’ (Weighted Methods per Class) as the title of the metric; however, there has been some confusion in the past about how to calculate it, so we stuck with File Level CC value.

    Our aim was to produce findings on various metrics on actual projects to prove if any (or any combination) of these metrics could be predictors for buggy code. As very few studies on existing projects are available we posted these results to determine the level of interest these data would generate and to make the information available to anyone who wishes to build on it.

    Over the next year we hope to release more data from different metrics and hope people will find this just as interesting.

    Rich

  2. Re: Cyclomatic Complexity Revisited

    Apr 1, 2008 4:20 PM by Gavin Terrill

    "Our aim was to produce findings on various metrics on actual projects to prove if any (or any combination) of these metrics could be predictors for buggy code. As very few studies on existing projects are available we posted these results to determine the level of interest these data would generate and to make the information available to anyone who wishes to build on it."

    Thanks for your comments, Rich. I found these results very interesting. To me, the focus on a CC of 1 to 25 is misplaced. In our shop, we plan to treat those above a certain CC threshold as technical debt as they are the ones that really have the potential to hurt us, so they deserve the most attention.

  3. Not sure about this...

    Apr 1, 2008 5:32 PM by Jim Leonardo

    This seems to reinforce the idea that CCN<25 is your target... the variance between the maximum and minimum fault prob over this range seems likely to fall into a range where you can determine whether the cost of fixing it is worth the benefit.

  4. More on Code Metrics

    Apr 4, 2008 3:16 AM by Patrick Smacchia

    While CC is an important metric to measure quality, there are many more code metrics possible. See a list of more than 60 code metrics + recommendations here: http://www.ndepend.com/Metrics.aspx
