Crap4J Seeks to Use Algorithms to Determine Code Quality

by Ian Roughley on Oct 24, 2007

The goals of a new project, crap4j, are clear:

There is no fool-proof, 100% accurate and objective way to determine if a particular piece of code is crappy or not. However, our intuition backed by research and empirical evidence is that unnecessarily complex and convoluted code, written by someone else, is the code most likely to elicit the "This is crap!" response. If the person looking at the code is also responsible for maintaining it going forward, the response typically changes into "Oh crap!"

Expanding upon the initial reactions, a more precise measurement is needed. Crap4j provides a single number that combines complexity and test code coverage.

Given a Java method m, CRAP for m is calculated as follows:

CRAP(m) = comp(m)^2 * (1 - cov(m)/100)^3 + comp(m)

Where comp(m) is the cyclomatic complexity of method m, and cov(m) is the test code coverage provided by automated tests (e.g. JUnit tests, not manual QA). Cyclomatic complexity is a well-known and widely used metric and it's calculated as one plus the number of unique decisions in the method. For code coverage we use basis path coverage.
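To make the formula concrete, here is a minimal Java sketch of the calculation. It is not crap4j's own code: the class name CrapScore and the method crap are hypothetical, and the sketch assumes that complexity is at least 1 and that coverage is supplied as a percentage between 0 and 100, with the uncovered fraction computed as 1 - cov(m)/100.

```java
public class CrapScore {

    /**
     * Computes CRAP(m) = comp(m)^2 * (1 - cov(m)/100)^3 + comp(m).
     *
     * @param complexity      cyclomatic complexity of the method (assumed >= 1)
     * @param coveragePercent automated test coverage of the method, 0 to 100
     */
    public static double crap(int complexity, double coveragePercent) {
        if (complexity < 1) {
            throw new IllegalArgumentException("complexity must be at least 1");
        }
        if (coveragePercent < 0.0 || coveragePercent > 100.0) {
            throw new IllegalArgumentException("coverage must be between 0 and 100");
        }
        // Fraction of the method that automated tests do not cover.
        double uncovered = 1.0 - coveragePercent / 100.0;
        return (double) complexity * complexity * Math.pow(uncovered, 3) + complexity;
    }

    public static void main(String[] args) {
        // Illustrative inputs only; these are not measurements from a real project.
        int[] complexities = {1, 5, 10, 30};
        double[] coverages = {0.0, 50.0, 100.0};
        for (int comp : complexities) {
            for (double cov : coverages) {
                System.out.printf("comp=%d cov=%.0f%% CRAP=%.2f%n", comp, cov, crap(comp, cov));
            }
        }
    }
}
```

For a method with complexity 10, the score is 110 with no coverage, 22.5 at 50% coverage, and 10 at full coverage. With full coverage the first term vanishes, so the score can never drop below the complexity itself.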

Like any metric, this number needs to be used with caution and not blindly followed, but it does provide a starting point to facilitate change and a way to pinpoint places in the code that are in more need of updating than others.

Low CRAP numbers indicate code with relatively low change and maintenance risk because it's not too complex and/or it's well-protected by automated and repeatable tests. High CRAP numbers indicate code that's risky to change because of a hazardous combination of high complexity and low, or no, automated test coverage.

Crap4j can be downloaded from www.junitfactory.com/crap4j/update/, and is currently a plug-in for Eclipse. For more information, JUnitFactory has provided an introductory article.

Download link by Martin Gilday

The correct Eclipse download site is www.junitfactory.com/crap4j/update/

Re: Download link by Bob Evans

Actually, Crap4j has its own home now at www.crap4j.org. You can download it from there at www.crap4j.org/download.html

Visualization by Pete Moore

Clover 2 provides a very neat visualization of complexity vs coverage (in html and in eclipse).

One could use it to identify "crap", however we think this is a negative way to look at the world. We prefer to think of it as identifying areas where more testing might be most valuable.

E.g. the "risks" cloud for Lucene looks like this: downloads.atlassian.com/software/clover/samples...

What's really interesting is that if you compare the above cloud to the one without complexity, the difference is startling: downloads.atlassian.com/software/clover/samples...

Anyway, install the eclipse plugin from update.atlassian.com/eclipse/clover/

Re: Visualization by Jeffrey Fredrick

Hi Pete!

Someone else at Agitar reminded me about the Clover tag clouds again today. They look cool... but looking closer I don't really like them. I actually blogged about this today (Visualizing Complexity and Coverage).

What does it look like when you have several thousand classes in your project?

The quick wins tab for the Lucene example does look cool, and much easier to read than the project risks tab. Is that typical across projects or something you only start to see when you have substantial coverage?

Oh, and what we've found is that the difference between "crap" and "risk" is if you inherited the code or not! ;)

Re: Visualization by Pete Moore

The colored cloud is a great way of displaying dense multidimensional data - the Project Risks cloud has complexity on one axis (size) and coverage on the other (color). In PMD's project risks cloud there are 545 classes and it quite clearly throws up a couple of dozen candidates for investigation.

It scales to large as well as to tiny projects because it is relative to the project it is run on. E.g. if you are at NASA and your least covered class has 95% coverage, it will be bright red. Similarly if every class has high complexity, only the worst offenders will be large.
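One way to picture "relative to the project" is min-max normalization over the project's own classes. The Java sketch below is only an illustration under that assumption: how Clover actually scales its clouds is not described here, and the names RiskCloudSketch, relative, and redness are hypothetical.

```java
public class RiskCloudSketch {

    /**
     * Min-max normalizes a value into [0, 1] relative to the project's own range.
     * Returns 0 when every class in the project has the same value.
     */
    public static double relative(double value, double min, double max) {
        return max == min ? 0.0 : (value - min) / (max - min);
    }

    /**
     * Assumed color mapping: 1.0 (bright red) for the least covered class in the
     * project, 0.0 for the most covered one, regardless of the absolute numbers.
     */
    public static double redness(double coverage, double minCoverage, double maxCoverage) {
        return 1.0 - relative(coverage, minCoverage, maxCoverage);
    }

    public static void main(String[] args) {
        // Hypothetical project where even the worst class has 95% coverage.
        double[] coverages = {95.0, 97.5, 100.0};
        for (double cov : coverages) {
            System.out.printf("coverage=%.1f%% redness=%.2f%n", cov, redness(cov, 95.0, 100.0));
        }
    }
}
```

Under this assumption the class with 95% coverage still gets the maximum redness, because it is the least covered class in its own project. The same relative helper could drive font size from complexity, so that only the most complex classes in a uniformly complex project appear large.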

Quick wins has total elements on the size axis and uncovered elements on the color axis. Big red = lots of uncovered elements. If you look at PMD's quick wins cloud it throws up only a couple of candidates.

The quick wins cloud is often "easier to read", as you noticed for Lucene, but it is mostly useful when you are just trying to increase your overall coverage number. In most situations not a good thing(tm). We jokingly refer to project risks as the "developers report" and quick wins as the "managers report" ;) That said, it is interesting (and fun) to compare quick wins to other clouds. At the moment there is only risk, but in the next major release we would like to make them user configurable. E.g. you could use crap4j's algorithm in a cloud.

An important thing to keep in mind is that visualizations like a cloud (or the treemap in the eclipse plugin) are just launching points for further investigation: possible candidates for more testing or a refactor. They are not an end in themselves and are never going to be a perfect representation of a project's risk (or anything else for that matter). Visualizations can definitely be useful for gleaning insight into an unfamiliar project; however, imho they are most valuable on the project one is intimately familiar with. In that case, the "candidates" often stand out like dogs balls.

Also note that clouds are just one small aspect of Clover 2. The real killer is seeing the per test coverage, i.e. which tests hit which code and the reverse what did *this* test hit. Because as useful as coverage is, well covered != well tested. It is also a great prefactoring aid.

InfoQ.com and all content copyright © 2006-2014 C4Media Inc.