Crap4J Seeks to Use Algorithms to Determine Code Quality
The goals of a new project, crap4j, are clear:
There is no fool-proof, 100% accurate and objective way to determine if a particular piece of code is crappy or not. However, our intuition backed by research and empirical evidence is that unnecessarily complex and convoluted code, written by someone else, is the code most likely to elicit the "This is crap!" response. If the person looking at the code is also responsible for maintaining it going forward, the response typically changes into "Oh crap!"
Expanding upon the initial reactions, a more precise measurement is needed. Crap4j provides a single number that combines complexity and test code coverage.
Given a Java method m, CRAP for m is calculated as follows:
CRAP(m) = comp(m)^2 * (1 - cov(m)/100)^3 + comp(m)
Where comp(m) is the cyclomatic complexity of method m, and cov(m) is the test code coverage provided by automated tests (e.g. JUnit tests, not manual QA). Cyclomatic complexity is a well-known and widely used metric and it's calculated as one plus the number of unique decisions in the method. For code coverage we use basis path coverage.
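The formula above can be sketched in a few lines of Java. This is a minimal illustration and not Crap4j's own code: the class name `CrapScore`, the method name `crap`, and the argument checks are assumptions made for the example. It takes the complexity and coverage as inputs and does not measure them.

```java
/** Illustrative sketch of the CRAP formula; not taken from Crap4j itself. */
final class CrapScore {
    private CrapScore() {}

    /**
     * Computes comp^2 * (1 - cov/100)^3 + comp.
     *
     * @param complexity      cyclomatic complexity of the method (1 or more)
     * @param coveragePercent automated test coverage of the method, 0 to 100
     */
    static double crap(int complexity, double coveragePercent) {
        if (complexity < 1) {
            throw new IllegalArgumentException("complexity must be at least 1");
        }
        if (coveragePercent < 0.0 || coveragePercent > 100.0) {
            throw new IllegalArgumentException("coverage must be between 0 and 100");
        }
        // Fraction of the method that automated tests do not cover.
        double uncovered = 1.0 - coveragePercent / 100.0;
        return (double) complexity * complexity * uncovered * uncovered * uncovered
                + complexity;
    }

    public static void main(String[] args) {
        // The same method (complexity 10) at three coverage levels.
        System.out.println(crap(10, 0.0));   // prints 110.0
        System.out.println(crap(10, 50.0));  // prints 22.5
        System.out.println(crap(10, 100.0)); // prints 10.0
    }
}
```

With full coverage the first term vanishes and the score equals the complexity; with no coverage the score is the complexity squared plus the complexity.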
Like any metric, this number needs to be used with caution and not blindly followed, but it does provide a starting point to facilitate change and a way to pinpoint places in the code that are in more need of updating than others.
Low CRAP numbers indicate code with relatively low change and maintenance risk because it's not too complex and/or it's well-protected by automated and repeatable tests. High CRAP numbers indicate code that's risky to change because of a hazardous combination of high complexity and low, or no, automated test coverage.
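One way to use the number for pinpointing risky code is to rank methods by their score, highest first. The sketch below assumes the complexity and coverage figures have already been collected by some other tool. The names `CrapRanking`, `MethodStats`, and `riskiestFirst` are hypothetical and are not part of Crap4j.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative ranking of methods by CRAP score; all names here are hypothetical. */
final class CrapRanking {
    private CrapRanking() {}

    /** Complexity and coverage figures for one method, supplied by the caller. */
    static final class MethodStats {
        final String name;
        final int complexity;          // cyclomatic complexity
        final double coveragePercent;  // automated test coverage, 0 to 100

        MethodStats(String name, int complexity, double coveragePercent) {
            this.name = name;
            this.complexity = complexity;
            this.coveragePercent = coveragePercent;
        }

        /** CRAP(m) = comp(m)^2 * (1 - cov(m)/100)^3 + comp(m). */
        double crap() {
            double uncovered = 1.0 - coveragePercent / 100.0;
            return (double) complexity * complexity * uncovered * uncovered * uncovered
                    + complexity;
        }
    }

    /** Returns a new list sorted so the highest CRAP score comes first. */
    static List<MethodStats> riskiestFirst(List<MethodStats> methods) {
        List<MethodStats> sorted = new ArrayList<>(methods);
        Comparator<MethodStats> byCrap = Comparator.comparingDouble(MethodStats::crap);
        sorted.sort(byCrap.reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<MethodStats> methods = new ArrayList<>();
        methods.add(new MethodStats("format", 3, 0.0));     // simple, untested
        methods.add(new MethodStats("parse", 12, 0.0));     // complex, untested
        methods.add(new MethodStats("validate", 12, 90.0)); // complex, well tested
        for (MethodStats m : riskiestFirst(methods)) {
            System.out.println(m.name + " " + m.crap());
        }
    }
}
```

In this made-up data the complex, untested method comes out far ahead of the other two, which is the "hazardous combination" described above.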
Crap4j can be downloaded from www.junitfactory.com/crap4j/update/, and is currently a plug-in for Eclipse. For more information, JUnitFactory has provided an introductory article.
Download link
by
Martin Gilday
Re: Download link
by
Bob Evans
Visualization
by
Pete Moore
One could use it to identify "crap"; however, we think this is a negative way to look at the world. We prefer to think of it as identifying areas where more testing might be most valuable.
E.g. the "risks" cloud for Lucene looks like this: downloads.atlassian.com/software/clover/samples...
What's really interesting is that if you compare the above cloud to the one without complexity, the difference is startling: downloads.atlassian.com/software/clover/samples...
Anyway, install the Eclipse plugin from update.atlassian.com/eclipse/clover/
Re: Visualization
by
Jeffrey Fredrick
Someone else at Agitar reminded me about the Clover tag clouds again today. They look cool... but looking closer I don't really like them. I actually blogged about this today (Visualizing Complexity and Coverage).
What does it look like when you have several thousand classes in your project?
The quick wins tab for the Lucene example does look cool, and much easier to read than the project risks tab. Is that typical across projects or something you only start to see when you have substantial coverage?
Oh, and what we've found is that the difference between "crap" and "risk" is if you inherited the code or not! ;)
Re: Visualization
by
Pete Moore
It scales to large as well as to tiny projects because it is relative to the project it is run on. E.g. if you are at NASA and your least covered class has 95% coverage, it will be bright red. Similarly if every class has high complexity, only the worst offenders will be large.
Quick wins has total elements on the size axis and uncovered elements on the color axis. Big red = lots of uncovered elements. If you look at PMD's quick wins cloud it throws up only a couple of candidates.
The quick wins cloud is often "easier to read", as you noticed for Lucene, but it is mostly useful when you are just trying to increase your overall coverage number. In most situations not a good thing(tm). We jokingly refer to project risks as the "developers report" and quick wins as the "managers report" ;) That said, it is interesting (and fun) to compare quick wins to other clouds. At the moment there is only risk, but in the next major release we would like to make them user configurable. E.g. you could use crap4j's algorithm in a cloud.
An important thing to keep in mind is that a visualization like a cloud (or the treemap in the Eclipse plugin) is just a launching point for further investigation, a way to find possible candidates for more testing or a refactor. Visualizations are not an end in themselves and are never going to be a perfect representation of a project's risk (or anything else for that matter). They can definitely be useful for gleaning insight into an unfamiliar project; however, imho they are most valuable on a project one is intimately familiar with. In that case, the "candidates" often stand out like dog's balls.
Also note that clouds are just one small aspect of Clover 2. The real killer is seeing the per-test coverage, i.e. which tests hit which code and, in reverse, what *this* test hit. Because as useful as coverage is, well covered != well tested. It is also a great prefactoring aid.