.NET Static Analysis and Parasoft dotTEST
Static Analysis in .NET
Wouldn't it be nice to receive a gentle tap on the shoulder if you're about to add code that will come back and haunt you later—in the form of a bug that could take days to find and fix later in the development process, code that's virtually impossible for your team members to reuse and extend, or a defect that impacts security, reliability, or performance in the field?
Static analysis can give you that kind reminder . . . and it's very feasible to implement for .NET development.
What is Static Analysis?
Static analysis is one of those funny terms like “web master” that can have widely different meanings. A working definition of “static analysis” is “the analysis of computer software that is performed without actually executing the software being tested.” This definition clarifies that static analysis is broader than we might expect, but limits the scope by excluding things like functional testing.
Things that fall under the static analysis umbrella can include:
- Code beautification
- Peer review (a.k.a. manual code review and code inspection)
- Pattern-based code scanners
- Flow-based code scanners
- Metrics-based code scanners
- Compiler and other build-related output
This article will discuss each of the above techniques, as well as where/when to use them and why they're helpful.
It's not surprising that static analysis tends to get a bad rap when it's relegated to a tool that simply checks whether curly brackets are in the proper place. This kind of use is potentially helpful, but it's far from being static analysis' most powerful capability.
Peer or manual code review is based on the idea that humans should look over each other’s shoulders to see if code accomplishes what it is supposed to do.
Peer code review remains the best approach for finding code defects. On average, 60% of defects can be removed via code reviews (Boehm and Basili, “Top Ten Software Defect Reduction List,” Computer, January 2001). This is hardly surprising, since code reviews use the finest analysis instrument available: the human brain.
Although you obviously can't automate peer code review, you can take advantage of a number of technologies to make peer review more efficient and help you move it beyond nitpicky low-level issues such as syntax. For instance, you can automate the checking of these low-level issues, as well as automate code review preparation, notification, and tracking.
Pattern-based analysis defines code patterns that are either desirable or undesirable, then makes sure that they are present or absent in your code, respectively.
In other words, static analysis can look for good patterns as well as bad ones. For example, it can make sure my code prints out a copyright statement, or it can check for formatting issues like bracket placement and case sensitivity. This approach helps teams check whether .NET code adheres to a policy or a set of coding practices.
In addition to finding syntax problems, pattern-matching static analysis can uncover potential causes of bugs. This doesn't necessarily reveal existing bugs, but rather points you to their breeding ground. In this way, pattern-based analysis can be used for error prevention, which is key to real productivity and quality improvements.
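To make the idea of "good" and "bad" patterns concrete, here is a deliberately tiny sketch of a pattern checker. It is not how dotTEST or any real scanner is implemented. The class name ToyPatternRules, the regular expression, and the two rules (a required copyright statement and a forbidden empty catch block) are all assumptions chosen for illustration. A text-level regular expression is also far cruder than the parsed-code matching a real tool would use.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical toy rule set; real tools match against parsed code, not raw text.
public static class ToyPatternRules
{
    // Undesirable pattern: a catch block whose body is empty.
    private static readonly Regex EmptyCatch =
        new Regex(@"catch\s*(\([^)]*\))?\s*\{\s*\}");

    public static List<string> Check(string source)
    {
        var violations = new List<string>();

        // Desirable pattern: must be present somewhere in the file.
        if (!source.Contains("Copyright"))
            violations.Add("Missing copyright statement");

        // Undesirable pattern: must be absent from the file.
        if (EmptyCatch.IsMatch(source))
            violations.Add("Empty catch block");

        return violations;
    }
}
```

An empty catch block is not a bug in itself, but it silently swallows failures, which is exactly the kind of breeding ground a pattern rule is meant to flag.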
Used properly, static analysis goes well beyond simple things like beautification: it can help developers identify potential logical flaws and detect errors such as NullReferenceExceptions and resource leaks.
Even the most basic static analysis, checking whether code complies with industry-accepted standards, can instantly expose problems that would require hours to find with unit testing, manual inspection, or other verification techniques.
It’s critical that you see static analysis coding standards as a means of preventing errors—not detecting them. Many developers are disappointed if a pattern-based static analysis violation doesn’t point them to an obvious bug. When they explore a violation and find an error-prone construct rather than an error, they think that the static analysis rules aren’t useful, eventually stop investigating violations, and later stop performing static analysis altogether.
This speaks to a fundamental problem with the software industry: Most development and testing teams are concerned with removing errors, but not with preventing them.
One really amazing thing about pattern-based static analysis is that it tends to improve developers. It looks over their shoulders and nags them when they do something in a way that isn’t ideal.
For example, a developer may write code that includes a try/catch block for accessing a database, but forget to free up resources using a finally block. With the proper static analysis rule, the developer is notified as he writes the code so he can fix it. After being “reminded” of this several times, eventually the developer simply changes the way he writes code.
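The try/catch scenario above can be sketched as follows. FakeConnection is a hypothetical stand-in for a real database connection (it only tracks whether it is open), and the method names are invented for this example. The first method shows the construct a rule would flag, and the second shows the shape the developer ends up writing after a few reminders.

```csharp
using System;

// Hypothetical stand-in for a database connection; it only tracks open/closed state.
public sealed class FakeConnection : IDisposable
{
    public bool IsOpen { get; private set; }

    public void Open() => IsOpen = true;

    public void Execute(string sql)
    {
        if (string.IsNullOrEmpty(sql))
            throw new ArgumentException("Empty query", nameof(sql));
    }

    public void Dispose() => IsOpen = false;
}

public static class CustomerQueries
{
    // Error-prone: if Execute throws, the connection is never released.
    public static bool RunLeaky(FakeConnection conn, string sql)
    {
        try
        {
            conn.Open();
            conn.Execute(sql);
            conn.Dispose();
            return true;
        }
        catch (ArgumentException)
        {
            return false; // conn is still open on this path
        }
    }

    // Safer: the finally block releases the connection on every path.
    public static bool RunSafe(FakeConnection conn, string sql)
    {
        try
        {
            conn.Open();
            conn.Execute(sql);
            return true;
        }
        catch (ArgumentException)
        {
            return false;
        }
        finally
        {
            conn.Dispose();
        }
    }
}
```

Calling RunLeaky with an empty query leaves IsOpen set to true, while RunSafe leaves it false. In everyday C#, a using statement gives the same guarantee as the explicit finally block with less code.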
While pattern-based static analysis looks for specific patterns in a particular file or class, flow analysis looks for patterns by trying to follow particular paths through the application—without actually running it.
Flow-based tools simulate application execution, look for possible paths through the code based on the logic, and may even try to inject bad data. This approach finds real bugs in your application by identifying paths that could trigger runtime defects such as NullReferenceExceptions and resource leaks, as well as security vulnerabilities such as SQL injection.
When you extend pattern-based static analysis with data flow analysis and then expand it further using custom coding standards, it becomes even more powerful. Developing custom standards helps you entirely eliminate many very complicated and important errors. For example, they can be used to expose errors that are unique to your application, such as security vulnerabilities and API usage errors. Such errors are very difficult to find through manual testing or inspection, and exponentially more difficult to fix if not detected until runtime.
Using an integrated set of static analysis techniques to expose these critical bugs early in the SDLC saves hours of diagnosis and potential rework.
Perhaps the oldest version of static analysis is metrics-based analysis, which measures code aspects like complexity or simply the number of lines or methods in a file.
Metrics analysis attempts to meet two goals: understand what’s going on in the code, and find possible problems. When we started doing static analysis 20 years ago, there were a lot of metrics. We used them to diagnose where an irreproducible problem might be stemming from, then used a debugger to debug locations suggested by the metrics.
The quality of metrics-based analysis depends on what you are looking for. Metrics are particularly good at giving you a broad understanding of your code. For example, if you measure the number of lines in your application's files and you notice some giant files cropping up, then your design probably needs improvement. Components should be very discrete, having known inputs and producing known outputs. If there is a lot of complicated logic in the middle of them, it's probably a good time to revisit your components and consider breaking them down.
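As a rough illustration of what a metric is, the sketch below counts lines and estimates complexity by counting branching keywords. The class and method names are hypothetical, and the keyword count is only a crude stand-in for a real complexity metric: it ignores comments, string literals, and several branching constructs, so treat the numbers as a hint rather than a measurement.

```csharp
using System.Text.RegularExpressions;

// Hypothetical toy metrics; a real tool computes these from parsed code.
public static class ToyMetrics
{
    // Keywords and operators that introduce an extra branch.
    private static readonly Regex DecisionPoints =
        new Regex(@"\b(if|for|foreach|while|case|catch)\b|&&|\|\|");

    // Number of lines in the source text.
    public static int CountLines(string source) => source.Split('\n').Length;

    // One base path plus one for each decision point found in the text.
    public static int ApproximateComplexity(string source) =>
        DecisionPoints.Matches(source).Count + 1;
}
```

A method whose estimate keeps climbing from one release to the next is a reasonable candidate for the kind of breakdown described above.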
An often-overlooked area of static analysis is the output from your compiler. In the past, this output was frequently ignored or glossed over, but developers interested in quality and stability are finding that cleaning up compiler warnings prevents oddball problems in the field.
The next time you see a compiler warning and are tempted to pass over it, think about whether you know enough to write your own compiler. If you don't, then fix the warning.
Static Analysis in .NET
Now that we’ve discussed various methods of static analysis, let’s focus on pattern-based and flow-based analysis in .NET. Parasoft has created a tool called dotTEST that performs static analysis, among other things.
Parasoft dotTEST is integrated into the Visual Studio IDE, where you can select projects or files in the VS Solution Explorer and run dotTEST commands on the selected resources. From VS you can view analysis results, reassign and suppress errors, and so forth.
How dotTEST Works
dotTEST performs static analysis by inspecting both IL code and source code. Examining the IL code allows dotTEST to analyze all .NET languages, though some rule checks must still be performed at the source code level.
dotTEST uses various techniques, such as parsing C# code, using the .NET Reflection API, and reading .NET assemblies. These techniques support static analysis, metrics, flow analysis (BugDetective), unit testing, and unit test generation.
One of the most advanced dotTEST features is its flow analysis, which constructs appropriate control flow graphs and analyzes them for NullReferenceExceptions, resource leaks, insecure operations, and other possible exception situations.
dotTEST can also compute unit test coverage or apply stubs during unit testing by rewriting IL code on-the-fly, just before it's compiled to native code during the application run.
Pattern-based Static Analysis in dotTEST
dotTEST comes preconfigured with hundreds of built-in rules, including guidelines from Microsoft's .NET Framework Design Guidelines, from the books Effective C# and .NET Gotchas, and from the experiences of software developers in many companies.
Over 440 pattern-rule pairs are pre-loaded into dotTEST; these rules cover:
- Common mistakes
- API design
- API usage
- Web Application Best Practices
- C# Best Practices
- Resource Leaks and Memory Usage
Each rule explains its own importance, the benefit or risk of following/not following it, and how to fix the patterns it discovers. Rule severity levels are pre-set based on potential damage, and are customizable. Rule groups also come packaged for scenarios such as Security, OWASP, NIST SAMATE, and IEC 62304.
Compliance requirements notwithstanding, it’s important to check whether code follows these rules, even after the code has been written.
Pattern-Based Static Analysis - Example
Let's assume that pattern-based static analysis reveals some code that violates the "Avoid static collections" rule.
This rule is important because it identifies code that could cause memory leaks. Static collection objects (e.g., ArrayList) can hold large numbers of objects, making them candidates for memory leaks. If you put a short-lived object into a static collection and forget to remove it, that object will be referenced by the collection for the life of the program.
Any resulting memory leaks could be uncovered through profiling or load testing, but that requires designing and implementing tests, then tracking the problem down to the appropriate line of code. Using a pattern-matching static analysis tool, such code can be detected automatically, in seconds.
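The static collection problem described above can be sketched in a few lines. Session and SessionRegistry are hypothetical names invented for this example. The point is only that anything added to the static list stays reachable, and therefore cannot be garbage collected, until someone remembers to remove it.

```csharp
using System.Collections.Generic;

// Hypothetical short-lived object.
public sealed class Session
{
    public string User { get; }

    public Session(string user) => User = user;
}

public static class SessionRegistry
{
    // Static collection: it lives for the life of the program,
    // and so does every object it still references.
    private static readonly List<Session> Active = new List<Session>();

    public static int Count => Active.Count;

    public static Session Begin(string user)
    {
        var session = new Session(user);
        Active.Add(session);
        return session;
    }

    // If a caller forgets to call End, the session is never released.
    public static void End(Session session) => Active.Remove(session);
}
```

If three sessions are started and only one is ended, the registry still holds the other two for as long as the process runs, even though no other code refers to them.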
The following screenshot shows how dotTEST’s pattern-matching static analysis identified a situation where the developer intended to use the logical AND operator (&&), but instead used the bitwise AND operator (&).
Since bitwise AND does not short-circuit, the operand that uses ssn is evaluated even when ssn is null, which causes an exception to be thrown. Although the exception is obvious in this simple scenario, it could easily escape testing in more complex cases.
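The code in the original screenshot is not reproduced here, so the following is a reconstruction of the same kind of mistake rather than the actual sample. The class name, the method names, and the nine-character length check are assumptions. Only the ssn variable and the & versus && confusion come from the description above.

```csharp
public static class SsnValidator
{
    // Bug: '&' always evaluates both operands, so ssn.Length runs
    // even when ssn is null and a NullReferenceException is thrown.
    public static bool IsValidBitwise(string ssn)
    {
        return (ssn != null) & (ssn.Length == 9);
    }

    // Fix: '&&' short-circuits, so ssn.Length is skipped when ssn is null.
    public static bool IsValidLogical(string ssn)
    {
        return ssn != null && ssn.Length == 9;
    }
}
```

Both methods return the same result for every non-null input, which is why a test suite that never passes null would not notice the difference.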
To ensure that the process is as effective and streamlined as possible, custom IL-level and C# rules can also be created in Parasoft's graphical rule-creation interface (RuleWizard) to enforce specific project and organizational requirements, and to prevent the recurrence of application-specific defects.
Rule names and severity categories can be mapped to match your team’s internal coding policies and priorities. Moreover, case-specific suppressions provide a systematic way to follow rules in general, but still make some exceptions that are deemed acceptable by you or your team.
Flow Analysis in dotTEST
For flow-based analysis, Parasoft’s BugDetective uses several analysis techniques, including simulation of application execution paths, to identify paths that could trigger runtime defects. Defects detected include null reference exceptions, division by zero, and resource leaks.
For example, the following graphic shows three issues that BugDetective found in a sample banking application:
Taking a closer look at the Avoid NullReferenceException error, note that BugDetective shows the complete path that leads to the issue it's pointing out.
In this case, cust was set to null on line 64. On line 68, there is a call to LookupCustomerName. An exception was thrown from within that method on line 48. This is indicated by the red ball. The control then goes to the catch block where a null reference occurs on line 74.
This problem would likely have escaped normal testing, because it occurs only on the path where the lookup fails.
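The banking sample itself is not shown, so the sketch below only mirrors the shape of the path described above: a variable starts out null, the lookup throws before it is assigned, and the catch block then dereferences it. All names other than cust are hypothetical, and the line numbers from the original report do not apply to this code.

```csharp
using System;

public sealed class Customer
{
    public string Name { get; set; } = "";
}

public static class BankReport
{
    // Hypothetical lookup that fails for unknown IDs.
    private static Customer LookupCustomer(int id)
    {
        if (id <= 0)
            throw new ArgumentException("Unknown customer id", nameof(id));
        return new Customer { Name = "Customer " + id };
    }

    public static string Describe(int id)
    {
        Customer cust = null;
        try
        {
            cust = LookupCustomer(id); // may throw before cust is assigned
            return cust.Name;
        }
        catch (ArgumentException)
        {
            // Bug: on this path cust is still null, so this line throws.
            return "Lookup failed for " + cust.Name;
        }
    }

    // One possible fix: the error path no longer touches cust.
    public static string DescribeSafely(int id)
    {
        try
        {
            return LookupCustomer(id).Name;
        }
        catch (ArgumentException)
        {
            return "Lookup failed for id " + id;
        }
    }
}
```

Describe works for every valid ID, so the defect only surfaces when the lookup fails, which is the kind of path flow analysis follows without running the code.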
The second error uncovered is another example of a bug that is very hard to reproduce and debug. This example indicates that the managed resource resourceCache is being used in a destructor, and that the method Changed is called from a destructor (not shown). The problem here is that, due to the nature of the garbage collector, it is possible that resourceCache has already been finalized at this point. We have seen this issue in several applications, and it can lead to crashes non-deterministically…and often when the applications are about to close!
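One common way to avoid this class of defect is the standard .NET dispose pattern, sketched below. This is not the code from the sample application. The class name is hypothetical, and resourceCache is modeled here as a simple list purely for illustration. The idea is that managed members are cleaned up only when Dispose is called explicitly, never from the destructor.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical owner of a managed cache, following the usual dispose pattern.
public class CachedResourceOwner : IDisposable
{
    private readonly List<string> resourceCache = new List<string>();
    private bool disposed;

    public int CachedCount => resourceCache.Count;

    public void Add(string item) => resourceCache.Add(item);

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this); // the destructor no longer needs to run
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposed) return;

        if (disposing)
        {
            // Reached only from Dispose(): managed members are still valid here.
            resourceCache.Clear();
        }

        // Unmanaged cleanup (none in this sketch) would go here for both paths.
        disposed = true;
    }

    ~CachedResourceOwner()
    {
        // Destructor path: pass false so managed members such as
        // resourceCache are not touched.
        Dispose(false);
    }
}
```

A class that holds no unmanaged resources usually does not need a destructor at all. It is kept here only to show which path must stay away from managed members.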
Static analysis has a broad set of capabilities to offer the .NET world. It can enforce pattern-based rules, whether they're based on proven standards or custom patterns that help you identify application-specific defects. It can quickly scan your entire code base and target all instances of high-risk code that violates your selected rule set.
Nevertheless, some defects cannot be detected by this analysis technique. Many defects arise due to interactions among different methods and classes, and also depend on the actual path of execution. Moreover, such defects often escape traditional testing efforts (such as unit testing and application-level testing) because the exceptional conditions are so difficult to reproduce. Even with 100% statement coverage, there are many paths that do not get covered. So it's helpful to have an automated tool that simulates a large number of paths through the code, looking for potential defects.
The flow analysis feature of dotTEST does exactly that. By simulating potential execution paths, it reveals where bugs are likely to crop up at runtime, allowing you to prevent their inception long before your project reaches the field.
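The gap between statement coverage and path coverage mentioned above is easy to demonstrate. In the hypothetical method below (the names and numbers are invented for this example), two tests are enough to execute every statement, yet the one path that neither test takes divides by zero.

```csharp
public static class Packing
{
    // Two tests, (true, false) and (false, true), execute every statement here.
    // Neither of them exercises the path where both flags are true.
    public static int UnitsPerBox(bool bulk, bool sample)
    {
        int boxes = 4;
        int units = 100;

        if (bulk) boxes -= 2;
        if (sample) boxes -= 2;

        // boxes is 0 only when both flags are true, which throws
        // a DivideByZeroException.
        return units / boxes;
    }
}
```

With only two conditions there are already four paths. Each additional independent condition doubles that number, which is why simulating paths automatically finds defects that hand-written tests tend to miss.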
Combine these powerful methods with policies to enforce static analysis, and you have a proven way to achieve key software development benefits:
- Detect bugs or potential bugs that impact reliability, security, and performance.
- Enforce organizational design guidelines and specifications (application-specific, use-specific, or platform-specific) and error-prevention guidelines abstracted from known specific bugs.
- Improve code maintainability by improving class design and code organization.
- Enhance code readability by applying common formatting, naming, and other stylistic conventions.
Regardless of the variety, employing static analysis while developing with .NET allows you to prevent defects by ferreting out their root causes and preventing the patterns that give rise to them.
For more details, see:
- Integrated Error-Detection Techniques Find More Bugs in .NET Applications—Demonstrates how to automate and synchronize error-detection techniques—including static code analysis, data flow analysis, and unit testing—to more effectively find defects in .NET applications.
- Static Analysis Best Practices—Explains how pattern-based analysis, data flow analysis, and metrics can help your team improve code security, reliability, performance, and maintainability—and how to get started as painlessly as possible.
- Data Flow Static Analysis: Static Analysis on Steroids—Explains how data flow analysis can be applied to bolster both your static analysis and unit testing efforts.
- Parasoft's Static Analysis Center—Provides an overview of Parasoft's static analysis technologies for .NET, Java, C, C++, FDA, safety-critical, and secure application development.
About the Author
As Evangelist and Solutions Architect for Parasoft, Arthur Hicken works very closely with the company's upper management in making strategic decisions. He has been involved in automating various practices at Parasoft for almost 20 years. He has worked on projects including database development, the software development life-cycle, web publishing and monitoring, various aspects of software build automation, and integration with legacy systems. Mr. Hicken's experience also includes supervising database technology, data mining, data warehousing, and database marketing, and he has developed a state-of-the-art internal database system.
Hicken has worked with IT departments in companies such as Cisco, Vanguard, and Motorola to help improve their software development practices. He has developed and conducted numerous in-house technical training courses at Parasoft. He has also developed and instructed several training courses for Parasoft clients. As an expert in his field, Arthur has been quoted in Business 2.0, Internet Week, and CNET news.com regarding Web site quality issues.