
Crap4J Seeks to Use Algorithms to Determine Code Quality



The goals of a new project, crap4j, are clear:

There is no fool-proof, 100% accurate and objective way to determine if a particular piece of code is crappy or not. However, our intuition backed by research and empirical evidence is that unnecessarily complex and convoluted code, written by someone else, is the code most likely to elicit the "This is crap!" response. If the person looking at the code is also responsible for maintaining it going forward, the response typically changes into "Oh crap!"

Beyond these initial gut reactions, a more precise measurement is needed. Crap4j provides a single number that combines complexity and test code coverage.

Given a Java method m, CRAP for m is calculated as follows:

CRAP(m) = comp(m)^2 * (1 - cov(m)/100)^3 + comp(m)

Where comp(m) is the cyclomatic complexity of method m, and cov(m) is the test code coverage provided by automated tests (e.g. JUnit tests, not manual QA). Cyclomatic complexity is a well-known and widely used metric, and it's calculated as one plus the number of unique decisions in the method. For code coverage we use basis path coverage.
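The formula above is simple enough to sketch directly in Java. The class and method names here (CrapMetric, crap) are illustrative only, not part of crap4j's actual API:

```java
// Sketch of the CRAP formula; names are illustrative, not crap4j's API.
public class CrapMetric {

    // comp: cyclomatic complexity of the method (1 + number of unique decisions)
    // cov:  basis path coverage from automated tests, as a percentage (0-100)
    public static double crap(int comp, double cov) {
        double uncovered = 1.0 - cov / 100.0;  // fraction of paths not covered
        return comp * comp * Math.pow(uncovered, 3) + comp;
    }

    public static void main(String[] args) {
        // Fully covered code pays only the linear complexity term:
        System.out.println(crap(5, 100.0));  // 5.0
        // Completely uncovered code pays the full quadratic penalty:
        System.out.println(crap(5, 0.0));    // 30.0
    }
}
```

Note how the uncovered fraction is cubed: even partial coverage shrinks the quadratic complexity penalty quickly.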

Like any metric, this number needs to be used with caution and not blindly followed, but it does provide a starting point to facilitate change and a way to pinpoint places in the code that are in more need of updating than others.

Low CRAP numbers indicate code with relatively low change and maintenance risk because it's not too complex and/or it's well-protected by automated and repeatable tests. High CRAP numbers indicate code that's risky to change because of a hazardous combination of high complexity and low, or no, automated test coverage.
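A few worked numbers make the risk interpretation concrete. The class name below is illustrative, and the coverage values are chosen only to show how the score moves; treat the specific inputs as assumptions rather than crap4j thresholds:

```java
// Worked examples of the CRAP formula showing how coverage drives risk.
public class CrapRisk {

    static double crap(int comp, double cov) {
        return comp * comp * Math.pow(1.0 - cov / 100.0, 3) + comp;
    }

    public static void main(String[] args) {
        // Same complexity (10), very different risk depending on coverage:
        System.out.println(crap(10, 0.0));   // 110.0 -> high CRAP, risky to change
        System.out.println(crap(10, 50.0));  // 22.5  -> much lower risk
        // Simple code is comparatively low-risk even when untested:
        System.out.println(crap(4, 0.0));    // 20.0
    }
}
```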

Crap4j is available for download and is currently a plug-in for Eclipse. For more information, JUnitFactory has provided an introductory article.


Community comments

  • Download link

    by Martin Gilday,


    The correct Eclipse download site is

  • Re: Download link

    by Bob Evans,


Actually, Crap4j has its own home now; you can download it from there.

  • Visualization

    by Pete Moore,


Clover 2 provides a very neat visualization of complexity vs coverage (in HTML and in Eclipse).

    One could use it to identify "crap", however we think this is a negative way to look at the world. We prefer to think of it as identifying areas where more testing might be most valuable.

    E.g. the "risks" cloud for Lucene looks like this

What's really interesting is that if you compare the above cloud to the one without complexity, the difference is startling

    Anyway, install the eclipse plugin from

  • Re: Visualization

    by Jeffrey Fredrick,


    Hi Pete!

    Someone else at Agitar reminded me about the Clover tag clouds again today. They look cool... but looking closer I don't really like them. I actually blogged about this today (Visualizing Complexity and Coverage).

What does it look like when you have several thousand classes in your project?

    The quick wins tab for the Lucene example does look cool, and much easier to read than the project risks tab. Is that typical across projects or something you only start to see when you have substantial coverage?

    Oh, and what we've found is that the difference between "crap" and "risk" is if you inherited the code or not! ;)

  • Re: Visualization

    by Pete Moore,


    The colored cloud is a great way of displaying dense multidimensional data - the Project Risks cloud has complexity on one axis (size) and coverage on the other (color). In PMD's project risks cloud there are 545 classes and it quite clearly throws up a couple of dozen candidates for investigation.

    It scales to large as well as to tiny projects because it is relative to the project it is run on. E.g. if you are at NASA and your least covered class has 95% coverage, it will be bright red. Similarly if every class has high complexity, only the worst offenders will be large.

    Quick wins has total elements on the size axis and uncovered elements on the color axis. Big red = lots of uncovered elements. If you look at PMD's quick wins cloud it throws up only a couple of candidates.

The quick wins cloud is often "easier to read", as you noticed for Lucene; it is mostly useful when you are just trying to increase your overall coverage number. In most situations that's not a good thing(tm). We jokingly refer to project risks as the "developers report" and quick wins as the "managers report" ;) That said, it is interesting (and fun) to compare quick wins to other clouds. At the moment there is only risk, but in the next major release we would like to make them user configurable. E.g. you could use crap4j's algorithm in a cloud.

An important thing to keep in mind is that visualizations like a cloud (or the treemap in the Eclipse plugin) are just launching points for further investigation, possible candidates for more testing or a refactor. They are not an end in themselves, and never going to be a perfect representation of a project's risk (or anything else for that matter). Visualizations can definitely be useful for gleaning insight into an unfamiliar project; however, imho they are most valuable on the project one is intimately familiar with. In that case, the "candidates" often stand out like dog's balls.

Also note that clouds are just one small aspect of Clover 2. The real killer is seeing the per-test coverage, i.e. which tests hit which code, and the reverse: what did *this* test hit. Because as useful as coverage is, well covered != well tested. It is also a great prefactoring aid.
