Posted by Gavin Terrill on Mar 31, 2008
Cyclomatic Complexity is a software metric that is used to measure the complexity of a given piece of code. It does this by counting the number of execution paths through the code. For example, a block of code with no branching statements has a complexity of 1. If you add an if test, then there will be 2 paths, one where the condition is true, and one where it is false.
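The counting rule described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `cyclomatic_complexity` and the set of node types treated as decision points are choices made for this example, and real metrics tools differ in what they count.

```python
import ast

# Node types treated as decision points in this sketch.
# This set is an assumption; real tools differ in the details.
DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source: str) -> int:
    """Return 1 plus the number of decision points found in the source."""
    tree = ast.parse(source)
    decisions = 0
    for node in ast.walk(tree):
        if isinstance(node, DECISION_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' has three operands, so two extra branches
            decisions += len(node.values) - 1
    return 1 + decisions


straight_line = "x = 1\ny = x + 2\n"
one_branch = "x = 1\nif x > 0:\n    y = 1\n"

print(cyclomatic_complexity(straight_line))  # 1: no branching statements
print(cyclomatic_complexity(one_branch))     # 2: the if test adds one path
```

The two samples mirror the cases in the text: code with no branches scores 1, and adding a single if test raises the score to 2.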
Software developers strive to build and maintain code with low complexity as it helps readability, and in theory reduces defect count. However, that theory has recently been challenged with the release of findings by the metrics dashboard provider Enerjy. Enerjy analyzed tens of thousands of source files, correlating defects against the Cyclomatic Complexity (CC):
The results show that the files having a CC value of 11 had the lowest probability of being fault-prone (28%). Files with a CC value of 38 had a probability of 50% of being fault-prone. Files containing CC values of 74 and up were determined to have a 98% plus probability of being fault-prone.
Andrew Binstock from SDTimes provided an interesting observation:
What Enerjy found was that routines with CCNs of 1 through 25 did not follow the expected result that greater CCN correlates to greater probability of defects. Rather, it found that for CCNs of 1 through 11, the higher the CCN the lower the bug probability. It was not until CCN reached 25 that defect probability rose sufficiently to be equal to that of routines with a CCN of 1.
The Enerjy results have created some confusion around how CC is counted. Keith Braithwaite pointed out that the Enerjy study counted CC at the file level, not the method level. Christopher Beck, commenting on The Quality of TDD, chimed in saying:
... it’s not CC (and shouldn’t be named CC). Rather it comes close to another metric called WMC or “Weighted Methods per Class”, which sums up CC values of a class.
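To make the distinction concrete, here is a rough sketch of how per-method CC values differ from their sum over a file or class. It is not Enerjy's procedure, which the source does not describe. The `Account` sample, the helper names, and the simplified set of decision nodes are all assumptions made for this example.

```python
import ast

# Simplified set of decision points (an assumption for this sketch).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def complexity(node: ast.AST) -> int:
    """1 plus the number of decision points under node (simplified counting)."""
    return 1 + sum(isinstance(n, DECISION_NODES) for n in ast.walk(node))


def per_method_and_total(source: str):
    """Return ({method name: CC}, sum of those CC values)."""
    tree = ast.parse(source)
    per_method = {
        n.name: complexity(n)
        for n in ast.walk(tree)
        if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
    return per_method, sum(per_method.values())


# Hypothetical class used only to illustrate the two ways of counting.
SAMPLE = '''
class Account:
    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.balance += amount

    def history(self):
        return list(self.entries)

    def withdraw(self, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount
'''

per_method, total = per_method_and_total(SAMPLE)
print(per_method)  # {'deposit': 2, 'history': 1, 'withdraw': 3}
print(total)       # 6
```

No single method here scores above 3, yet the summed value is 6. A summed figure grows with the number of methods as well as with branching, which is why a file-level number cannot be read against method-level thresholds. Note that this sketch would count a nested function twice, once on its own and once inside its parent.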
Regardless of the purity of the Enerjy approach to computing CC, one thing is clear: if a file's CC is 74 or higher, there is a very good chance that it is buggy.
These results have produced some response from the industry that we have found very interesting. In the original posting we did state that the results were for the Cyclomatic Complexity values per file and are not weighted for any other factors.
The reason we released these findings in this way was that we had performed this work on over 200 different metrics, and analyzing this at the method level would have taken a lot more time (and was not required for our purposes at the time).
Maybe we could have used ‘WMC’ (Weighted Methods per Class) as the title of the metric; however, there has been some confusion in the past over how to calculate it, so we stuck with the file-level CC value.
Our aim was to produce findings on various metrics on actual projects to prove if any (or any combination) of these metrics could be predictors for buggy code. As very few studies on existing projects are available we posted these results to determine the level of interest these data would generate and to make the information available to anyone who wishes to build on it.
Over the next year we hope to release more data from different metrics and hope people will find this just as interesting.
Rich
Thanks for your comments Rich. I found these results very interesting.
To me, the focus on a CC of 1 to 25 is misplaced. In our shop, we plan to treat those above a certain CC threshold as technical debt as they are the ones that really have the potential to hurt us, so they deserve the most attention.
This seems to reinforce the idea that CCN<25 is your target... the variance between the maximum and minimum fault prob over this range seems likely to fall into a range where you can determine whether the cost of fixing it is worth the benefit.
While CC is an important metric to measure quality, there are many more code metrics possible.
See a list of more than 60 code metrics + recommendations here:
www.ndepend.com/Metrics.aspx