
Coverity: Open Source Code Has Fewer Defects than Commercial One


A Coverity study concludes that open source code checked with static analysis has, on average, a lower defect density than commercial code, but the two are on par when projects of similar size are compared.

Coverity Scan is a public-private research project focused on open source code integrity. It was started in 2006 at the initiative of the U.S. Department of Homeland Security, in collaboration with Stanford University. Coverity Scan has used the Coverity Static Analysis tool to evaluate and improve the code quality of over 300 open source projects over the last 5 years. For example, the tool helped fix over 6,000 bugs in open source code in 2006.

The recently published 2011 Scan Report (PDF) concludes that, on average, open source projects contain fewer defects than commercial ones. The source code of 45 of the most active open source projects, totaling over 37 million lines of code, was analyzed during 2011. Only high- and medium-impact defects were considered. The study does not include data on defects found through QA testing or post-deployment. All of these projects are part of the Coverity Scan program, which performs static analysis of their code.

Most of the open source projects analyzed had 100k-500k lines of code (LOC), while 2 projects had over 7M lines. The total was 37,446,469 LOC, an average of about 832,000 LOC per project. The defect density found in open source projects was 0.45 defects per 1,000 lines of code, while the industry average is around 1 defect per 1,000 lines of code for companies not using automated testing such as static analysis.
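To make the arithmetic behind these figures concrete, here is a minimal Python sketch. The function names are hypothetical and chosen for illustration only, and the implied defect total is merely derived from the rounded 0.45 density quoted above; it is not a figure taken from the report.

```python
def defect_density(defects: int, lines_of_code: int) -> float:
    """Return defects per 1,000 lines of code (hypothetical helper)."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return defects / (lines_of_code / 1000)


def implied_defects(density: float, lines_of_code: int) -> float:
    """Invert the density formula to estimate a defect count."""
    return density * lines_of_code / 1000


# Figures quoted in the article.
TOTAL_LOC = 37_446_469
PROJECT_COUNT = 45
OPEN_SOURCE_DENSITY = 0.45  # defects per 1,000 LOC, as reported (rounded)

average_loc = TOTAL_LOC / PROJECT_COUNT
estimated_total = implied_defects(OPEN_SOURCE_DENSITY, TOTAL_LOC)

print(f"Average project size: {average_loc:,.0f} LOC")
print(f"Defects implied by the rounded density: about {estimated_total:,.0f}")
```

Because the quoted density is rounded to two decimals, the implied total is only an approximation of the number of defects actually counted in the study.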

The scan tracked 14 types of defects, the top 5 defects found in open source code being:

Defect                       Quantity   Impact
Control Flow Issue           3,128      Medium
Null Pointer Dereferences    2,818      Medium
Uninitialized Variables      2,051      High
Memory - Corruptions         1,551      High
Error Handling Issues        1,535      Medium

The study also scanned over 300M LOC from 41 commercial projects that use static analysis, spanning multiple verticals and code sizes, with an average of 7.4M LOC per project. It concludes that, for projects of the same size, the quality of open source code is on par with that of proprietary code. For example, Linux 2.6 has almost 7M LOC and a defect density of 0.62, while the average for proprietary code is 0.64. Linux usually has a lower defect density, but its codebase grew from 5.3M to 6.8M LOC in 2011. PHP 5.3 and PostgreSQL 9.1 stand out as reference points, with 0.2 and 0.21 defects per 1,000 LOC, respectively.

The overall conclusion of the study is that automated testing, including static analysis, lowers the number of code defects, something that is quite obvious.
