Application Security Testing: The Double-sided Black Box
One of the biggest risks in software security is the opaque nature of verification tools and processes, and the potential for false negatives: vulnerabilities that a particular verification technique (e.g. automated dynamic testing) cannot detect.
Despite the many best practices around a secure Software Development Lifecycle (SDLC), most organizations tend to rely primarily on testing to build secure software. One of the most significant byproducts of current testing methods is that organizations rarely understand what is being tested - and, more importantly, what is NOT being tested - by their solution. Our research suggests that any single automated assurance mechanism can verify at most 44% of security requirements. The NIST Static Analysis Tool Exposition found that all static analysis tools combined reported warnings on only 4 of the 26 known vulnerabilities in Tomcat. Because the practice of relying on opaque verification processes is so pervasive, it has become the industry standard, and consequently many organizations are content with testing as the primary means to secure software.
Suppose, for example, you hire a consultancy to perform a penetration test on your software. Many people call this testing "black box" after the QA technique of the same name, where testers do not have detailed knowledge of the system internals (e.g. system code). After executing the test, the firm produces a report outlining several vulnerabilities in your application. You remediate the vulnerabilities, submit the application for re-testing, and the next report comes back "clean" – i.e. without any vulnerabilities. At best, this simply tells you that your application can't be broken into by the same testers in the same time frame. It doesn't tell you:
- What are the potential threats to your application?
- Which threats is your application “not vulnerable” to?
- Which threats did the testers not assess your application for? Which threats were not possible to test from a runtime perspective?
- How did time and other constraints on the test affect the reliability of results? For example, if the testers had 5 more days, what other security tests would they have executed?
- What was the skill level of the testers and would you get the same set of results from a different tester or another consultancy?
In our experience, organizations aren’t able to answer most of these questions. The black box is double-sided: the tester doesn't understand the application internals, and the organization requesting the test doesn't know much about the security posture of its software. We're not the only ones who acknowledge this issue: Haroon Meer discussed the challenges of penetration testing at 44con. Most of these issues apply to every form of verification: automated dynamic testing, automated static testing, manual penetration testing, and manual code review. In fact, a recent paper describes similar challenges in source code review.
Examples of Requirements
To better illustrate this issue, let’s take a look at some common high-risk software security requirements and examine how common verification methods apply to them.
Requirement: Hash user passwords using a secure hashing algorithm (e.g. SHA-2) and a unique salt value. Iterate the algorithm multiple times.
How common verification methods apply:
- Automated run-time testing: Unlikely to have access to stored passwords, therefore unable to verify this requirement
- Manual run-time testing: Only able to verify this requirement if another exploit results in a dump of stored passwords. This is unreliable, therefore you cannot count on run-time testing to verify the requirement
- Automated static analysis: Only able to verify this requirement under the following conditions:
- The tool understands how authentication works (i.e. uses a standard component, such as Java Realms)
- The tool understands which specific hashing algorithm the application uses
- The tool understands if the application uses unique salt values for each hash
In practice, there are so many ways to implement authentication that it is unrealistic to expect a static analysis tool to be able to verify this requirement across the board. A more realistic scenario is for the tool to simply recognize authentication and point out that secure hashing and salting are necessary. Another scenario is for you to create custom rules to identify the algorithm and hash value and verify they meet your own policy, although in our experience this practice is rare.
- Manual code review: The most reliable common verification method for this requirement. Manual assessors can understand where authentication happens in the code, and verify that hashing and salting meets best practices.
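To make the requirement concrete, here is a minimal sketch of a compliant implementation using Python's standard library PBKDF2 function. The iteration count and salt length are assumptions; tune them to your own policy:

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # assumed value; set per your own performance/security budget

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) using iterated SHA-256 (PBKDF2) and a unique salt."""
    salt = os.urandom(16)  # unique, random salt for every user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Recompute the digest and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)
```

Note that because the salt is random, two users with the same password end up with different digests - precisely the property that is easy for a human reviewer to confirm in code and hard for run-time testing to observe.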
Requirement: Bind variables in SQL statements to prevent SQL injection
SQL Injection is one of the most devastating application vulnerabilities. A recent flaw in Ruby on Rails allowed SQL Injection for applications built on its stack.
How common verification methods apply:
- Automated run-time testing: While run-time testing may be able to find the presence of SQL injection by analyzing behavior, it cannot verify the absence of it. Therefore, automated run-time testing cannot verify this requirement completely
- Manual run-time testing: Same limitations as automated run-time testing
- Automated static analysis: Generally able to verify this requirement, particularly if you are using a standard library to access a SQL database. The tool should be able to understand if you are dynamically concatenating SQL statements with user input, or using proper variable binding. There is a chance, however, that static analysis may miss SQL injection vulnerabilities in the following scenarios:
- You use stored procedures on the database and are unable to scan the database code. In some circumstances, stored procedures can be susceptible to SQL injection
- You use an Object Relational Mapping (ORM) library which your static analysis tool does not support. ORMs can also be susceptible to injection.
- You use non-standard drivers / libraries for database connectivity, and the drivers do not properly implement common security controls such as prepared statements
- Manual code review: Like static analysis, manual code review can confirm the absence of SQL injection vulnerabilities. In practice, however, production applications may have hundreds or thousands of SQL statements. Manually reviewing each one can be very time consuming and error prone.
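The difference between concatenation and binding is easy to demonstrate. Here is a minimal sketch using Python's built-in sqlite3 module; the table and payload are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("alice", "alice@example.com"))

payload = "nobody' OR '1'='1"  # classic injection payload

# Vulnerable: concatenation lets attacker input become SQL code,
# so this query returns every row in the table.
leaked = conn.execute(
    "SELECT email FROM users WHERE name = '" + payload + "'").fetchall()

# Safe: the bound variable is treated strictly as data,
# so the payload matches no user and nothing leaks.
safe = conn.execute(
    "SELECT email FROM users WHERE name = ?", (payload,)).fetchall()
```

This is exactly the pattern a static analysis tool looks for: a query built by string concatenation versus a query passed with a parameter tuple.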
Requirement: Apply authorization checks to ensure users cannot view another user’s data.
How common verification methods apply:
- Automated run-time testing: By accessing data from two different users and then attempting to access one user’s data from another user’s account, automated tools can perform some level of testing on this requirement. However, these tools are unlikely to know which data in a user’s account is sensitive or if changing the parameter "data=account1" to "data=account2" represents a breach of authorization.
- Manual run-time testing: Manual run-time tests are generally the most effective method of catching this vulnerability because human beings can have the domain knowledge required to spot this attack. There are some instances, however, where a run-time tester may not have all of the information necessary to find a vulnerability - for example, when appending a hidden parameter such as “admin=true” allows access to another user’s data without an authorization check.
- Automated static analysis: Without rule customization, automated tools are generally ineffective in finding this kind of vulnerability because it requires domain understanding. For example, a static analysis tool is unable to know that the “data” parameter represents confidential information and requires an authorization check.
- Manual code review: Manual code review can reveal instances of missing authorization that can be difficult to find with run-time testing, such as the impact of adding an “admin=true” parameter. However, actually verifying the presence of authorization checks with manual code review can be laborious. An authorization check can appear in many different parts of the code, so a manual reviewer may need to trace through several different execution paths to detect the presence or absence of authorization.
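For illustration, here is a minimal sketch of the kind of check a reviewer is tracing through the code to find. The in-memory record store and field names are hypothetical:

```python
# Hypothetical record store: record id -> owner and data.
RECORDS = {
    "account1": {"owner": "alice", "balance": 100},
    "account2": {"owner": "bob", "balance": 250},
}

def get_record(record_id: str, current_user: str) -> dict:
    """Return a record only if current_user owns it; deny by default."""
    record = RECORDS.get(record_id)
    if record is None or record["owner"] != current_user:
        # Raise the same error whether the record is missing or owned by
        # someone else, so the response never reveals that it exists.
        raise PermissionError("not authorized")
    return record
```

A tool cannot know that "account2" belongs to a different user than the one logged in; a human reviewer can, which is why this requirement leans so heavily on manual verification.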
Impact on you
The opaque nature of verification means that effective management of software security requirements is essential. With requirements listed, testers can specify whether they assessed a particular requirement and which techniques they used to do so. Critics argue that penetration testers shouldn't follow a "checklist approach to auditing" because no checklist can cover the breadth of obscure and domain-specific vulnerabilities. Yet the flexibility to find unique issues does not obviate the need to verify well-understood requirements. The situation is very similar in standard software Quality Assurance (QA): good QA testers both verify functional requirements AND think outside the box about creative ways to break functionality. Testing blindly and reporting defects without verifying functional requirements would dramatically reduce the utility of quality assurance. Why accept a lower standard from security testing?
Before you perform your next security verification activity, make sure you have software security requirements to measure against and that you define which requirements are in scope for the verification. If you engage manual penetration testers or source code reviewers, it should be relatively simple for them to specify which requirements they tested for. If you use an automated tool or service, work with your vendor to find out which requirements their tool or service cannot reliably test for. Your tester/product/service is unlikely to guarantee an absence of false negatives (i.e. certify that your application is not vulnerable to SQL injection), but knowing what they did and did not test for can dramatically increase your confidence that your system does not contain known, preventable security flaws.
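What "measuring against requirements" can look like in practice is little more than a coverage matrix. Here is a hypothetical, minimal sketch; the requirement names and method labels are purely illustrative:

```python
# Hypothetical coverage matrix: requirement -> verification methods
# that actually exercised it in the last assessment.
REQUIREMENTS = {
    "Hash passwords with a unique salt": ["manual code review"],
    "Bind variables in SQL statements": ["static analysis", "manual code review"],
    "Authorization checks on per-user data": [],
}

def untested(reqs: dict[str, list[str]]) -> list[str]:
    """Return the requirements no verification activity has covered yet."""
    return [name for name, methods in reqs.items() if not methods]

for name in untested(REQUIREMENTS):
    print("NOT TESTED:", name)
```

Even a table this simple answers the questions the double-sided black box leaves open: what was tested, how, and what was not tested at all.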
About the Author
Rohit Sethi (@rksethi on Twitter) is lucky to work with amazing people at SD Elements, focusing on application security requirements. He has helped improve software security at some of the world’s most security sensitive organizations in financial services, software, ecommerce, healthcare, telecom and other industries. Rohit has built and taught SANS courses on Secure J2EE development. He has spoken and taught at FS-ISAC, RSA, OWASP, Secure Development Conference, Shmoocon, CSI National, Sec Tor, Infosecurity, CFI-CIRT, and many others. Mr. Sethi has written articles for InfoQ, Dr. Dobb's Journal, TechTarget, Security Focus and the Web Application Security Consortium (WASC), has appeared on Fox News Live, and has been quoted as an expert in application security for CNN, Discovery News and Computer World. He also created the OWASP Design Patterns Security Analysis project.
The primary problem is that application development teams leave security until the last minute: there is no time for manual security testing and no time for manual code review. Almost all verification work has to be fully automated today, which leads to massive gaps. When tools that discover only 15 percent of vulnerabilities are treated as 100 percent coverage, we are doing something wrong.
If you are an app owner, or know an app owner, make sure they know to submit as much information about their risky projects to their application security leads as early as possible in the process. Have an automated system that pushes wireframes, source code fragments (e.g. prototypes, frameworks/libraries/components - especially 3rd-party ones), and buildable source code (including all target artifacts, such as tests) to your infosec teams. They'll know what to do with it.