Your Code as a Crime Scene

Measuring software complexity is a common activity in the software development community, judging by the number of tools built over the years and the literature on the subject. Drawing on his background in both engineering and psychology, Adam Tornhill proposed to his audience at QCon London that they treat their code as a crime scene.

Tornhill believes that current software complexity metrics are imperfect, so he turned to his knowledge of psychology to look for answers. Geographical offender profiling is an investigative method based on the principle that an offender's home base tends to lie within the boundaries defined by the locations of his or her crimes.

Tornhill applies the same principle to code with the help of tools such as CodeCity, which create geographical representations of code. Districts and buildings map the structure of the code, such as packages or classes. Code attributes, such as the number of lines of code or the number of methods, determine the dimensions of the districts and buildings. Tornhill then combines that structural information with what he calls spatial movement in code. For that, he needs the help of version control tools.

Version control tools provide a lot of forensic detail, such as who made a change, when, and where in the code base. Combining this spatial information with the code structure highlights hotspots. Tornhill claimed that in a case study (400 KLOC, 89 developers, 18000+ commits) hotspots pinpointed 7 of the 8 most defect-dense parts: 4% of the code was responsible for 72% of all defects.
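As a rough illustration of the idea, the sketch below ranks files by how often they change, weighted by their size. It is not Tornhill's tooling: the function name `find_hotspots`, the input shapes, and the score (revisions multiplied by lines of code) are all assumptions made for this example. The commit data could come from something like parsed `git log --name-only` output.

```python
from collections import Counter


def find_hotspots(commits, lines_of_code, top=3):
    """Rank files by change frequency weighted by size (hypothetical score).

    commits: iterable of lists of file paths touched by each commit.
    lines_of_code: dict mapping a file path to its current line count.
    Returns up to `top` tuples of (path, revisions, lines_of_code).
    """
    # Count each file at most once per commit.
    revisions = Counter(path for files in commits for path in set(files))
    scored = [
        (path, revisions[path], lines_of_code.get(path, 0))
        for path in revisions
    ]
    # Assumed score: revisions * lines of code; the path breaks ties.
    scored.sort(key=lambda row: (row[1] * row[2], row[0]), reverse=True)
    return scored[:top]


if __name__ == "__main__":
    history = [["a.py", "b.py"], ["a.py"], ["a.py", "c.py"]]
    sizes = {"a.py": 900, "b.py": 100, "c.py": 2000}
    for path, revs, loc in find_hotspots(history, sizes):
        print(path, revs, loc)
```

Lines of code is only a crude stand-in for complexity here; any other per-file metric could be substituted in `lines_of_code` without changing the ranking logic.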

A code city with hotspots highlighted.

Version control information also makes temporal coupling analysis possible. If two code files change at the same time, the files may be physically coupled, e.g., one class using another. But they may be coupled only logically, usually as a result of copy-paste activities. Without temporal coupling analysis, these problematic cases are easy to miss.
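A minimal sketch of such an analysis might count how often each pair of files appears in the same commit. The function name `temporal_coupling`, the `min_shared` threshold, and the coupling degree (shared commits divided by the average number of revisions of the two files) are assumptions chosen for illustration, not a description of the tools Tornhill presented.

```python
from collections import Counter
from itertools import combinations


def temporal_coupling(commits, min_shared=2):
    """Find file pairs that tend to change in the same commit.

    commits: iterable of lists of file paths touched by each commit.
    Returns tuples of (file_a, file_b, shared_commits, degree),
    strongest coupling first.
    """
    revisions = Counter()
    shared = Counter()
    for files in commits:
        unique = sorted(set(files))
        revisions.update(unique)
        # Every pair of files in one commit counts as one co-change.
        shared.update(combinations(unique, 2))

    result = []
    for (a, b), count in shared.items():
        if count < min_shared:
            continue  # ignore pairs that co-changed too rarely
        average = (revisions[a] + revisions[b]) / 2
        result.append((a, b, count, round(count / average, 2)))
    result.sort(key=lambda row: (-row[3], row[0], row[1]))
    return result


if __name__ == "__main__":
    history = [["a", "b"], ["a", "b"], ["a", "c"], ["b"]]
    print(temporal_coupling(history))
```

Pairs with a high degree but no direct dependency in the source are the candidates worth inspecting for copy-paste or other hidden, logical coupling.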

Temporal coupling analysis may also help in other ways. It can highlight change patterns where members of different teams change different components at about the same time. This pattern may hint at a misalignment between the architecture of a system and the structure of its teams, leading to longer cycle times between a change request and its release to production.

Version control information can also be mined for knowledge owners and component ownership. If a developer is the majority committer of a given code file or component, then it can be safely assumed that they are the knowledge owner of that component, even if they do not belong to the team responsible for it. This also means that the "hit by a bus" factor can be accounted for and mitigated. In a more extreme case, when the knowledge owner is no longer at the company, a knowledge gap may occur. These techniques help to identify those gaps and close them.
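The ownership idea can be sketched as a count of commits per author for each file. The function name `main_authors` and the input shape, a list of (author, files) pairs, are assumptions for this example, and counting commits rather than changed lines is a simplification.

```python
from collections import Counter, defaultdict


def main_authors(commits):
    """Find the most frequent committer of each file.

    commits: iterable of (author, list_of_file_paths) pairs.
    Returns a dict mapping path to (author, share_of_commits).
    """
    per_file = defaultdict(Counter)
    for author, files in commits:
        for path in set(files):
            per_file[path][author] += 1

    owners = {}
    for path, counts in per_file.items():
        author, count = counts.most_common(1)[0]
        owners[path] = (author, count / sum(counts.values()))
    return owners


if __name__ == "__main__":
    history = [
        ("ann", ["a.py"]),
        ("ann", ["a.py", "b.py"]),
        ("bob", ["a.py"]),
        ("bob", ["b.py"]),
        ("bob", ["b.py"]),
    ]
    for path, (author, share) in sorted(main_authors(history).items()):
        print(f"{path}: {author} ({share:.0%})")
```

A share above one half corresponds to the majority committer described above. Files whose main author has left the company could then be flagged as potential knowledge gaps.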

Version control forensics shows the effective ownership of components.

Tornhill is writing a book on this subject, currently in beta. The Pragmatic Bookshelf will publish it, with an estimated release date of 2015-03-10.
