Crap4J Seeks to Use Algorithms to Determine Code Quality

Posted by Ian Roughley on Oct 24, 2007

Sections: Process & Practices, Architecture & Design, Development
Topics: Code Analysis, Java

The goals of a new project, crap4j, are clear:

There is no fool-proof, 100% accurate and objective way to determine if a particular piece of code is crappy or not. However, our intuition backed by research and empirical evidence is that unnecessarily complex and convoluted code, written by someone else, is the code most likely to elicit the "This is crap!" response. If the person looking at the code is also responsible for maintaining it going forward, the response typically changes into "Oh crap!"

Expanding upon the initial reactions, a more precise measurement is needed. Crap4j provides a single number that combines complexity and test code coverage.

Given a Java method m, CRAP for m is calculated as follows:

CRAP(m) = comp(m)^2 * (1 - cov(m)/100)^3 + comp(m)

Where comp(m) is the cyclomatic complexity of method m, and cov(m) is the test code coverage provided by automated tests (e.g. JUnit tests, not manual QA). Cyclomatic complexity is a well-known and widely used metric and it's calculated as one plus the number of unique decisions in the method. For code coverage we use basis path coverage.

Like any metric, this number needs to be used with caution and not blindly followed, but it does provide a starting point to facilitate change and a way to pinpoint places in the code that are in more need of updating than others.

Low CRAP numbers indicate code with relatively low change and maintenance risk because it's not too complex and/or it's well-protected by automated and repeatable tests. High CRAP numbers indicate code that's risky to change because of a hazardous combination of high complexity and low, or no, automated test coverage.
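
As a rough illustration of the formula quoted above, here is a minimal Java sketch. The class and method names (CrapScore, crap) are hypothetical and are not taken from Crap4j itself. The sketch assumes coverage is supplied as a percentage between 0 and 100 and that the complexity value has already been measured by some other tool.

```java
/** Minimal sketch of the CRAP formula quoted above; not Crap4j's actual code. */
final class CrapScore {

    private CrapScore() {
        // Utility class, not meant to be instantiated.
    }

    /**
     * Computes CRAP(m) = comp(m)^2 * (1 - cov(m)/100)^3 + comp(m).
     *
     * @param complexity      cyclomatic complexity comp(m); at least 1
     * @param coveragePercent automated test coverage cov(m), from 0 to 100
     */
    static double crap(int complexity, double coveragePercent) {
        if (complexity < 1) {
            throw new IllegalArgumentException("complexity must be at least 1");
        }
        if (coveragePercent < 0.0 || coveragePercent > 100.0) {
            throw new IllegalArgumentException("coverage must be between 0 and 100");
        }
        // Fraction of the method NOT exercised by automated tests.
        double uncovered = 1.0 - coveragePercent / 100.0;
        return (double) complexity * complexity
                * uncovered * uncovered * uncovered
                + complexity;
    }

    public static void main(String[] args) {
        System.out.println(crap(5, 100.0));  // simple and fully covered: 5.0
        System.out.println(crap(5, 0.0));    // simple but untested: 30.0
        System.out.println(crap(10, 50.0));  // moderate complexity, half covered: 22.5
        System.out.println(crap(30, 0.0));   // complex and untested: 930.0
    }
}
```

With full coverage the first term vanishes and the score equals the complexity alone. With no coverage the score is comp(m)^2 + comp(m), so complex, untested methods dominate any ranking.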

Crap4j can be downloaded from www.junitfactory.com/crap4j/update/, and is currently a plug-in for Eclipse. For more information, JUnitFactory has provided an introductory article.

  1. Download link

    by Martin Gilday

    The correct Eclipse download site is www.junitfactory.com/crap4j/update/

  2. Re: Download link

    by Bob Evans

    Actually, Crap4j has its own home now at www.crap4j.org. You can download it from there at www.crap4j.org/download.html

  3. Visualization

    by Pete Moore

    Clover 2 provides a very neat visualization of complexity vs coverage (in HTML and in Eclipse).

    One could use it to identify "crap", however we think this is a negative way to look at the world. We prefer to think of it as identifying areas where more testing might be most valuable.

    E.g. the "risks" cloud for Lucene looks like this downloads.atlassian.com/software/clover/samples...

    What's really interesting is that if you compare the above cloud to the one without complexity, the difference is startling: downloads.atlassian.com/software/clover/samples...

    Anyway, install the Eclipse plugin from update.atlassian.com/eclipse/clover/

  4. Re: Visualization

    by Jeffrey Fredrick

    Hi Pete!

    Someone else at Agitar reminded me about the Clover tag clouds again today. They look cool... but looking closer I don't really like them. I actually blogged about this today (Visualizing Complexity and Coverage).

    What does it look like when you have several thousand classes in your project?

    The quick wins tab for the Lucene example does look cool, and much easier to read than the project risks tab. Is that typical across projects or something you only start to see when you have substantial coverage?

    Oh, and what we've found is that the difference between "crap" and "risk" is whether you inherited the code or not! ;)

  5. Re: Visualization

    by Pete Moore

    The colored cloud is a great way of displaying dense multidimensional data - the Project Risks cloud has complexity on one axis (size) and coverage on the other (color). In PMD's project risks cloud there are 545 classes and it quite clearly throws up a couple of dozen candidates for investigation.

    It scales to large as well as to tiny projects because it is relative to the project it is run on. E.g. if you are at NASA and your least covered class has 95% coverage, it will be bright red. Similarly if every class has high complexity, only the worst offenders will be large.

    Quick wins has total elements on the size axis and uncovered elements on the color axis. Big red = lots of uncovered elements. If you look at PMD's quick wins cloud it throws up only a couple of candidates.

    The quick wins cloud is often "easier to read", as you noticed for Lucene, but it is mostly useful when you are just trying to increase your overall coverage number, which in most situations is not a good thing(tm). We jokingly refer to project risks as the "developers report" and quick wins as the "managers report" ;) That said, it is interesting (and fun) to compare quick wins to other clouds. At the moment there is only risk, but in the next major release we would like to make them user configurable. E.g. you could use crap4j's algorithm in a cloud.

    An important thing to keep in mind is that a visualization like a cloud (or the treemap in the Eclipse plugin) is just a launching point for further investigation, a way to find possible candidates for more testing or a refactor. Visualizations are not an end in themselves and are never going to be a perfect representation of a project's risk (or anything else for that matter). They can definitely be useful for gleaning insight into an unfamiliar project; however, imho they are most valuable on a project one is intimately familiar with. In that case, the "candidates" often stand out like dog's balls.

    Also note that clouds are just one small aspect of Clover 2. The real killer is seeing the per-test coverage, i.e. which tests hit which code and, in reverse, what did *this* test hit. Because as useful as coverage is, well covered != well tested. It is also a great prefactoring aid.
