Threat Modeling Express
Many software developers understand the importance of finding bugs early in the Software Development Life Cycle (SDLC). The community has produced several methodologies to assist in embedding security into the SDLC, including (but not limited to):
- Open Software Assurance Maturity Model
- Building Security-In Maturity Model (BSIMM)
- Microsoft’s Secure Development Lifecycle (SDL)
All of these approaches suggest some variant of threat modeling as a design level activity. At a broad level, threat modeling is simply the process of looking at a system’s design from an attacker’s perspective. Generally, we can decompose threat modeling into a set of major steps:
- Evaluate what a potential attacker may be interested in.
- Enumerate what potential vulnerabilities exist in the system.
- Evaluate the risk of these potential vulnerabilities: the likelihood that they exist and can be successfully exploited, and how they can help the attacker achieve the goals identified in step 1.
- Determine which countermeasures the system should contain to stop the vulnerabilities.
Threat modeling provides several key benefits:
- Prevent vulnerabilities in design rather than later in the SDLC, thereby saving cost.
- Understand which attacks and countermeasures are actually relevant to your system.
- Prioritize building security countermeasures by risk.
The concrete activities for threat modeling differ by implementation: A seasoned penetration tester will create an informal threat model in her head before she attempts to break into a target. When we try to bring threat modeling into team-based application development, relying on a single security expert’s in-memory model breaks down. As a result, some organizations have attempted to formalize the threat modeling process. Microsoft’s process is arguably the most popular and is supported by their SDL Threat Modeling Tool.
Formal Threat Modeling
Over the years we’ve had the opportunity to perform and observe threat modeling for several clients. In many cases, a comprehensive threat model can be incredibly time consuming and inefficient. Organizations often elect to use a formal Threat Risk Assessment process concentrated at the application level. Typically these organizations will perform the following steps:
- Define all types of data in the system including the Confidentiality, Integrity and Availability (CIA) requirements – e.g. credit card data, authentication credentials, user contact information, etc.
- Define all the threat agents – e.g. malicious external hacker, malicious internal hacker, activist, corporate espionage, etc.
- Define all the use cases – e.g. open an account, make a booking, etc.
- For each use case, define how the application flows between system components. Components can be high level, such as physical servers, or low level, such as design-level constructs - e.g. business logic layer, presentation layer, etc. Also document which data will be accessed in each use case and the corresponding CIA requirements.
- For each operation, reference a list of well known attack vectors and see if those attack vectors apply to your system. For example, if a use case involves database access then SQL injection is a potential attack vector. Reference external sources such as the OWASP ASVS, WASC Threat Classification, or SANS/CWE Top 25 for an index of threats to consider.
- Evaluate the risk of each potential attack and define countermeasures, such as using stored procedures to protect against SQL Injection.
- Document all of this in a report and hand it to developers when they start coding.
Although the benefits of threat modeling may be easy to articulate, the cost of performing such a time-consuming activity often means that development shops are unwilling to take on the burden of a threat model. The result, unfortunately, is that in our experience few organizations actually perform threat modeling.
The Anti-Process Movement
As the agile movement catches on, development shops are increasingly pushing back against heavy processes. A series of articles from Yishan Wong, the former Director of Engineering at Facebook, points out the growing trend: "Process imposed externally ('from above') is more difficult to break down and tends to be unnecessarily enshrined, thus producing extra organizational inertia". Even organizations using traditional waterfall development processes are unlikely to add large, time-consuming processes unless they're fully convinced of the Return on Investment (ROI). Although you can probably make the argument that threat modeling is well worth the money, it's sure to be an uphill battle. As one of the students in our early application security classes put it, "Threat modeling looks wonderful but there's no way the business will push deadlines for this".
As an industry, we're stuck in a bit of a rut: not performing threat modeling is inefficient and increases application security costs, yet many organizations are unwilling to take on the overhead of a new time-consuming process.
Agile Threat Modeling
After surveying our clients and the industry at large for both threat modeling and more general risk assessment processes, we found a shining example of ingenuity: Thomas Peltier's Facilitated Risk Analysis Process (FRAP). FRAP consists of a sit-down meeting in which stakeholders simply enumerate the major risks to the enterprise. FRAP is to formal risk assessments roughly what agile is to waterfall: less formal, less comprehensive, less time-consuming, lighter on documentation, and more consensus-oriented. We decided at that point to take the FRAP approach to risk assessments and apply it to threat modeling. The result is a process we’ve coined "Threat Modeling Express".
Introducing Threat Modeling Express
A Threat Modeling Express session is a single, four hour meeting where key stakeholders collaboratively define threats and countermeasures according to business priorities.
Threat Modeling Express (TME) is derived from a couple of core philosophies. The TME should:
- Be a group activity
- Be quick and rely on information that may not be complete
Threat modeling captures an understanding of what’s important to the business, details of an application's underlying technology, and relevant attack vectors. In our experience, the most time-efficient approach is to gather the stakeholders for all three factors into a room to discuss: business/application/product owners (often known amongst technology workers as "the business"), architects and/or lead developers, and an application security subject matter expert. TME, therefore, is fundamentally group-based. It helps build buy-in from all three stakeholder groups and decreases resistance to later decisions on whether or not to include security controls within the application. We recommend scheduling a four hour meeting for this discussion. If scheduling constraints make this impractical then try to arrange the longest meeting that you can – in many cases, this may be one hour.
A formal threat model should document all the application’s data types, technology stack, use cases, and attack vectors. In our experience, collecting all of this information is nearly impossible for a large enterprise application for the following reasons:
- Few organizations catalogue all of their data types. The ones that do catalogue all of their data types struggle to keep their catalogues up-to-date with changes to their applications
- Few organizations maintain tidy repositories of all of the application’s use cases. This knowledge is often spread amongst several people
- In-depth documentation of the application’s technology stack is rarely available from a single source
- Nobody is enough of an expert to enumerate all possible attack vectors. Moreover, any repository of well-known attack vectors such as the Common Weakness Enumeration (CWE) is too large to digest in a reasonable amount of time
TME takes the approach that we should rely on information that’s easily available. In other words, bring any readily available documentation to the meeting but otherwise rely on what’s in people’s heads. You can always follow up on major gaps of understanding after the meeting. By reducing the upfront requirements of a TME, we make the process quicker and therefore easier to repeat as our understanding of the system, its uses, and attack vectors grows. TMEs are therefore analogous to the "Release early, release often" principle in Agile processes. Another consequence of the "be quick" philosophy is that TME is light on producing documentation. The end result should be two simple prioritized lists: threats and countermeasures. If you need more documentation, for example for audit purposes, consider taking meeting minutes or recording the session.
Focus on threats that won’t be caught elsewhere
TME sessions are the perfect time to discuss domain-specific threats[i], such as privilege escalation. Discussing domain-agnostic threats, such as cross site scripting and SQL injection, can be useful if the only security activity you’re doing today is blackbox penetration testing. On the other hand, if your developers are already aware of the dangers of SQL injection and you have other processes such as static analysis to detect SQL injection, then discussing SQL injection in your Threat Modeling Express is not an effective use of time. In our experience, the most effective use of time-crunched TME sessions (particularly those under two hours) is to discuss the kinds of threats that your existing tools won’t automatically detect.
Threat Modeling Express Steps and Case Study
In the following section we document the steps of a TME in detail. In order to provide context, we introduce a single case study derived from a mix of real TMEs that we performed with our clients.
Eastcoast United Bank is one of the nation’s largest financial institutions. Recently, Eastcoast United’s chief security officer made SDLC security a strategic initiative. As part of the larger project, we facilitated TME sessions across several lines of business. The business banking division was rolling out a new version of their existing main Internet facing application, Online Business Banking (OBB), and wanted to assess the risks at the design stage.
TME Step 1: Define Goals – Determine the reason for hosting a TME session
The main reasons for hosting a TME session are either to help build security into the application’s design or to help guide penetration testing and/or source code review to focus on the threats that are most relevant to the business.
Case Study Step 1:
Our goal was to help identify major security risks at the design stage.
TME Step 2: Gather information
Get up to speed on the application’s purposes, use cases, deployment and user base. For existing applications, a really effective way to do this is to use the application itself.
Case Study Step 2:
Since the application was already built, we prepared for the TME session primarily by actually using the application and casually speaking to one of the developers. We were able to learn important information in a short period of time:
- The application’s purpose is for their clients’ finance departments to oversee business banking transactions and administer access to other business banking applications
- Major use cases included:
  - Viewing account transaction history, including payments and deposits
  - Running reports on specific transaction types and user types, including looking for irregular activity
  - Administering user access for the other business banking applications such as payroll, bill payments, alerts, wire transfers, etc.
  - Exporting data to major accounting software packages
- The application was Internet facing and deployed across the country. Users were primarily the financial and accounting departments of small and medium-sized businesses across many different industry verticals
TME Step 3: Host Meeting
After your prep-work for the TME is done, host the meeting. Try for a four hour window at first and keep decreasing the window until you’re able to successfully schedule the meeting. Tell your meeting attendees exactly what you propose to do ahead of time and suggest that they come to the meeting with as much background as possible to make effective use of time. Make sure that you have somebody representing the views of "the business", the development team, and security. Note that for large applications, you may need several tech leads or architects representing different technical components. Make sure to appoint a facilitator, who will most likely be you if you’re driving the TME process.
Case Study Step 3:
Finding a four hour meeting window at Eastcoast United was next to impossible. Instead we booked a two hour meeting and invited representatives from the application security practice, the risk practice, an architect, a lead developer, and a business analyst.
TME Step 4: Enumerate Threats
You should have an understanding of use cases from step 2. Explain each use case to the attendees and then say, "if I were a motivated attacker how could I exploit this use case?". The key here is NOT to talk about technology specific attacks but rather think about exploiting the process. It often helps to look at a prioritized list of possible attacker motivations. For example:
- Cause harm to human safety
- Financial gain
- Steal personal records
- Gain a competitive advantage
- Attack organizational stakeholders
- Diminish ability to make decisions
- Disrupt operations
Examining the threats in a technology agnostic manner allows you to focus on "what could happen" rather than how. Perhaps more importantly, it allows the business to participate without getting lost in technical jargon. Spend more time on threats with higher attacker motivations and less time on lower motivations. Write each threat on a whiteboard or flipchart as you discuss it. At the end of this step you'll have a series of business logic threats against your application that you can revisit at every new release, penetration test, or source code review.
Time box the discussion for each threat at five minutes. Park any longer discussions until the end of the meeting and have the facilitator make a temporary decision if necessary.
It’s worth re-emphasizing that you shouldn’t talk about specific technical attacks here. In our experience, this is difficult for some security personnel and developers to grasp; they’ve been trained to think about technical problems and it can be hard to look at a high level threat without delving into the nitty-gritty "how". Just remind the meeting attendees that this portion of the meeting requires everyone’s input, including a non technical audience. Moreover, the list of threats will survive past the upcoming release and will still be relevant in the future when new technical attacks emerge.
Case study step 4:
Eastcoast United developers disagreed about attacker motivations. While it was clear that financial gain was the top attacker motivation, it was less than clear what came next. Rather than waste precious time on building consensus, we heard everyone’s take and forced a working prioritization:
- Financial gain
- View other user’s financial data / steal personal records
- Cause financial harm to Eastcoast United’s clients
We then began looking at specific use cases and mapped them to the attacker motivations. For example, for the use case "Viewing account transaction history, including payments and deposits" we came up with the following:
- For the "Financial gain" motivation, some of the threats we identified were creative:
  - Malicious creditor attempts to modify payment history in order to deny that a payment was actually made, thereby getting paid twice
  - Malicious employee illicitly views data of another employee
  - Malicious employee illicitly views payroll or contractor payment disbursements in order to ask for a pay increase
- For the "View other user’s financial data / steal personal records" motivation, the threat was more straightforward:
  - Malicious user views another user’s account transaction history
We continued this for every use case and every attacker motivation. We made sure not to reject any ideas, no matter how far-fetched they were. In several cases we seemed to be spending too much time on a particular use case/attacker motivation pair so we parked the issue and moved forward. In other cases, it was difficult to see any obvious mapping of an attacker motivation to a use case.
Ultimately, we produced a list of threats that Eastcoast United could re-use on every application security initiative it undertook.
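The walk through every use case and attacker motivation pair can be sketched as a simple facilitator's agenda. The following is a minimal Python sketch rather than part of the TME process itself: the function name `discussion_agenda` is hypothetical, the use case labels are shortened from the case study, and applying the five minute time box to each pair (rather than to each threat) is our simplifying assumption.

```python
from itertools import product

# Attacker motivations in the working priority order from the case study.
MOTIVATIONS = [
    "Financial gain",
    "View other user's financial data / steal personal records",
    "Cause financial harm to Eastcoast United's clients",
]

# Use cases gathered in step 2 (labels shortened for illustration).
USE_CASES = [
    "View account transaction history",
    "Run reports on transactions and users",
    "Administer user access to other business banking applications",
    "Export data to accounting software",
]

TIME_BOX_MINUTES = 5  # assumed here to apply to each pair under discussion


def discussion_agenda(motivations, use_cases):
    """Return (motivation, use_case) pairs, highest-priority motivation first."""
    return list(product(motivations, use_cases))


agenda = discussion_agenda(MOTIVATIONS, USE_CASES)
for motivation, use_case in agenda:
    print(f"[{TIME_BOX_MINUTES} min] {use_case} <- {motivation}")
```

Ordering the pairs by motivation first means that if the meeting runs out of time, the pairs left undiscussed are the ones tied to the lowest-priority motivations.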
TME Step 5: Map Threats to Vulnerabilities
At this point you can excuse the business representative from the meeting. Now you focus on the "how" - figure out which technical vulnerabilities could enable the threats.
For each threat from step 4 determine if/how the threat can be enabled. Do not concentrate on whether or not you have sufficient controls in place to prevent the vulnerability. For example, even if you make proper use of stored procedures in the database you should still consider SQL injection to be a threat.
Here it’s useful to have a list of potential attack vectors against your system. If you’re dealing with a web application then the Web Application Security Consortium’s Threat Classification project is a good starting point. As with step 4, time box the discussion to five minutes per threat and park any issues that take longer.
If you already have tools like dynamic or static analysis software to catch domain-agnostic vulnerabilities, use this time to focus on domain-specific vulnerabilities such as parameter manipulation.
Case study step 5:
The security-aware staff at Eastcoast United jumped at the opportunity to start listing the vulnerabilities that might enable the higher level threats.
Some of the vulnerabilities they found included:
- Denial of service through large requests
- SQL injection
- Cross site scripting
- Insufficient authentication controls for end users (e.g. missing brute force detection, insufficient password complexity, etc.)
- Cross site request forgery (CSRF) as a means to forge transactions outlined in step 4
- Parameter manipulation as a means to perform privilege escalation outlined in step 4
- XML injection to downstream web services
- Insufficient authentication of SOAP web service calls between servers
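One way to record the output of this step is a simple mapping from each step 4 threat to the vulnerabilities that could enable it. The sketch below is illustrative only: the specific threat-to-vulnerability pairings and the function name `threats_by_vulnerability` are our assumptions, not results from the Eastcoast United session. Inverting the mapping shows which vulnerabilities enable the most threats, which can be a useful input to the risk ranking discussion.

```python
from collections import defaultdict

# Hypothetical pairings for illustration; your meeting will produce its own.
THREAT_TO_VULNERABILITIES = {
    "Malicious user views another user's account transaction history": [
        "Parameter manipulation",
        "SQL injection",
    ],
    "Malicious creditor modifies payment history": [
        "SQL injection",
        "Cross site request forgery (CSRF)",
    ],
    "Malicious employee views payroll disbursements": [
        "Parameter manipulation",
        "Insufficient authentication controls for end users",
    ],
}


def threats_by_vulnerability(mapping):
    """Invert threat -> vulnerabilities into vulnerability -> threats."""
    inverted = defaultdict(list)
    for threat, vulnerabilities in mapping.items():
        for vulnerability in vulnerabilities:
            inverted[vulnerability].append(threat)
    return dict(inverted)


by_vulnerability = threats_by_vulnerability(THREAT_TO_VULNERABILITIES)
for vulnerability, threats in by_vulnerability.items():
    print(f"{vulnerability}: enables {len(threats)} threat(s)")
```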
TME Step 6: Risk Rank Vulnerabilities
Draw a chart with two axes. The X axis is impact and the Y axis is likelihood. This chart represents risk, where the top right corner is the highest risk and the bottom left corner is the lowest risk. Now, as a group, look at each vulnerability you identified in step 5 and rank it according to risk. The scale of the axes is not important since you are simply ranking the vulnerabilities relative to one another. Naturally this stage can take a long time to debate, so make sure not to spend more than two minutes analyzing a specific vulnerability. Also remember to rank the risk as if there were no countermeasure in place. This is important: you are prioritizing potential vulnerabilities in your system, because at this point you cannot say with any certainty that a particular countermeasure already exists.
At the end of this step you will have a prioritized set of vulnerabilities to fix in your system, along with the set of business logic threats from step 4.
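If you want to capture the whiteboard chart after the meeting, the placement of each vulnerability can be recorded as a pair of ordinal scores and sorted. This is a minimal sketch under stated assumptions: TME itself only asks for a relative ranking on the chart, so the 1 to 5 scales, the use of impact multiplied by likelihood as a combined score, the tie-break on impact, and the sample scores below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vulnerability:
    name: str
    impact: int      # x axis position, 1 (low) to 5 (high); assumed scale
    likelihood: int  # y axis position, 1 (low) to 5 (high); assumed scale

    @property
    def risk(self) -> int:
        # Assumed combined score: further toward the top right is riskier.
        return self.impact * self.likelihood


def rank_by_risk(vulnerabilities):
    """Sort highest risk first, breaking ties in favour of higher impact."""
    return sorted(vulnerabilities, key=lambda v: (v.risk, v.impact), reverse=True)


# Hypothetical placements, scored as if no countermeasures were in place.
chart = [
    Vulnerability("Cross site scripting", impact=3, likelihood=4),
    Vulnerability("SQL injection", impact=5, likelihood=4),
    Vulnerability("Denial of service through large requests", impact=2, likelihood=3),
    Vulnerability("Parameter manipulation", impact=4, likelihood=3),
]

ranked = rank_by_risk(chart)
for position, vulnerability in enumerate(ranked, start=1):
    print(f"{position}. {vulnerability.name} (risk {vulnerability.risk})")
```

Because only the relative order matters, any monotonic scoring scheme would serve; the product is used here simply because it matches the intuition of the two-axis chart.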
Case study step 6:
The discussion on risk ranking was heated. It was nearly impossible to find consensus on the risk ranking of many of the vulnerabilities. Meeting attendees tended to want to place the majority of vulnerabilities in the top right quadrant of the chart. We reminded the attendees that calling everything high risk results in nothing being prioritized. Forced prioritization was difficult but important. Nearly everyone agreed that SQL injection was the highest risk vulnerability. Cross site scripting and parameter manipulation for privilege escalation generated the most controversy. Two of the developers were insistent that their application was not vulnerable to privilege escalation through parameter manipulation. We reminded them that the presence or absence of countermeasures was not relevant to this step of the process.
TME Step 7: Enumerate Countermeasures
The final step is to take each vulnerability in step 6 and determine its corresponding countermeasure. You will likely find the Common Weakness Enumeration useful for this purpose. Because the vulnerabilities were risk ranked in step 6, the corresponding list of countermeasures is naturally prioritized. You can now implement the countermeasures by risk, according to the threats that your business actually cares about.
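Because the countermeasure list inherits its order from the risk ranking, producing it amounts to a lookup over the ranked vulnerabilities. The sketch below assumes a small hand-built catalogue; the function name `prioritized_countermeasures`, the catalogue entries, and the follow-up placeholder text are hypothetical, and in practice the group would fill in the entries during the meeting.

```python
# Assumed catalogue of countermeasures, keyed by vulnerability name.
COUNTERMEASURES = {
    "SQL injection": "Use stored procedures with proper variable binding",
    "Cross site scripting": "Perform contextual output encoding",
    "Parameter manipulation": "Enforce a multi-layer authorization scheme on the server",
}

FOLLOW_UP = "TODO: follow up after the meeting"


def prioritized_countermeasures(ranked_vulnerabilities, catalogue):
    """Pair each vulnerability with a countermeasure, keeping the risk order."""
    plan = []
    for rank, name in enumerate(ranked_vulnerabilities, start=1):
        plan.append((rank, name, catalogue.get(name, FOLLOW_UP)))
    return plan


# Hypothetical output of step 6, highest risk first.
ranked_names = [
    "SQL injection",
    "Parameter manipulation",
    "Cross site scripting",
    "XML injection to downstream web services",
]

plan = prioritized_countermeasures(ranked_names, COUNTERMEASURES)
for rank, name, countermeasure in plan:
    print(f"{rank}. {name}: {countermeasure}")
```

Vulnerabilities with no agreed countermeasure are kept in the list with a follow-up marker rather than dropped, so the gap stays visible after the meeting.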
Case study step 7:
This step was the most straightforward and generated the least debate. Having undergone many security audits in the past, the Eastcoast United developers generally knew what countermeasures to apply. For example:
- Always use stored procedures for database interaction with proper variable binding to prevent SQL injection
- Perform contextual output encoding to prevent XSS
- Employ a multi-layer authorization scheme enforced on the server to prevent privilege escalation
These features were fed into the design of the new release and undoubtedly reduced, although did not eliminate, the number of vulnerabilities that appeared once the application was completed and security tested.
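To make two of these countermeasures concrete, here is a minimal Python sketch of variable binding and output encoding. It is not Eastcoast United's implementation. It uses the standard library's `sqlite3` module, which has no stored procedures, so only the variable binding half of that countermeasure is shown. The table, column, and function names are hypothetical, and `html.escape` covers only the HTML element and quoted attribute contexts; other output contexts such as JavaScript or URLs need their own encoders.

```python
import html
import sqlite3


def find_transactions(conn, account_id):
    """Look up transactions using a bound parameter instead of string building."""
    # The ? placeholder binds account_id as data; it is never parsed as SQL.
    cursor = conn.execute(
        "SELECT id, amount FROM transactions WHERE account_id = ? ORDER BY id",
        (account_id,),
    )
    return cursor.fetchall()


def render_memo_cell(memo):
    """Encode untrusted text before placing it inside an HTML element."""
    return "<td>" + html.escape(memo) + "</td>"


# Hypothetical in-memory data set for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (id INTEGER PRIMARY KEY, account_id TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO transactions (account_id, amount) VALUES (?, ?)",
    [("acct-1", 100.0), ("acct-2", 250.0)],
)

legitimate = find_transactions(conn, "acct-1")
# A classic injection string is treated as a literal account id and matches nothing.
injected = find_transactions(conn, "acct-1' OR '1'='1")
encoded = render_memo_cell("<script>alert(1)</script>")

print(legitimate)
print(injected)
print(encoded)
```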
Threat Modeling Express is not for everyone. Decreased rigor & formality come at the expense of potentially missing important threats. The lightweight nature of analysis means it is well-suited for certain kinds of applications such as enterprise web applications written in managed programming languages, but not as well suited for software with more complex threats such as operating systems. Moreover, TME assumes that you have an application security Subject Matter Expert available for the four hour meeting. In some environments this may not be possible. These challenges may mean that TME is not the right approach for you and you should consider a different methodology such as Microsoft’s SDL or Octotrike.
Threat Modeling Express is a lightweight technique to prioritize your application security efforts. Based on a four hour meeting format, TME fills the gap between a complete absence of design security process and a comprehensive, formal approach. TME builds consensus from key stakeholders on risks that matter to the business, while capturing both domain agnostic and domain specific threats – the latter of which often require human analysis to discover.
We have facilitated TME sessions successfully at several clients in many industries. It has provided us with useful prioritization in further testing efforts while helping development teams to focus on secure development & design efforts.
TME is most successful in well-defined problem spaces such as web applications written in managed programming languages. TME is not as rigorous as other methodologies and it requires subject matter expertise in the security of the application you are building.
Ultimately, TME is best used where time-to-market pressures prohibit a more rigorous approach and in agile environments that eschew heavy processes.
About the Authors
Sahba Kazerooni, Director of Professional Services at Security Compass, is responsible for managing Security Compass’s internationally renowned consultants on cutting edge consulting and training engagements across North America and around the world. Sahba's skill set ranges from hands-on assessments in application penetration testing, threat modeling, and source code review, to security advisory and technical training. He has an advanced knowledge of the Software Development Life Cycle (SDLC) as well as the intricacies of the Java programming language. Sahba is also an internationally-renowned speaker on security topics. He has presented at conferences around the world; he delivers Java secure coding training at the SANS Institute; and he has also provided numerous presentations through ISC2 to their elite network of certified information security professionals. Sahba can be reached at email@example.com.
Rohit Sethi, VP Product at SD Elements, is a specialist in building security controls into the software development life cycle (SDLC). Rohit is a SANS course developer and instructor on Secure J2EE development. He has spoken and taught at FS-ISAC, RSA, OWASP, Shmoocon, CSI National, Sec Tor, Infosecurity New York and Toronto, TASK, the ISC2's Secure Leadership series conferences, and many others. Mr. Sethi has written articles for Dr. Dobb's Journal, TechTarget, Security Focus and the Web Application Security Consortium (WASC), and he has been quoted as an expert in application security for ITWorldCanada and Computer World. He also leads the OWASP Design Patterns Security Analysis project. Rohit can be reached at firstname.lastname@example.org.
[i] Note that we haven’t precisely defined the term "threat". In our collective experience we’ve heard and seen at least dozens of interpretations of the word "threat". Our view is that in the time it takes to research and settle on the meaning of the word, you could have completed a TME session. Understandably, some people don’t agree with this line of thinking and in that case you should probably come up with an internal definition and stick with it ahead of the meeting.
On the risk rating, it seems it is usually difficult because the participants do not have the right metrics and are more technically oriented, than from the business side.
It's not clear from the article how they worked around this, and whether they showed how to determine a risk rating, and whether they had enough information available (asset value, etc). Perhaps this is the role of another team, i.e. IT Risk, etc.
Re: Risk rating
In Threat Modeling Express we tend to take a "gut feel" approach to determining a risk rating rather than analyzing hard data such as asset value. There's definitely problems with this approach, and if you have the time and data available then you should definitely consider a more rigorous approach to analyzing risk.
The key in TME is to have a business representative at the table so that they can add their voice to the risk ranking process.
Re: Risk rating
From your explanation, and reading the article in-depth, including the presentation by Thomas Peltier, I think I'm understanding it better and now see that one of the guiding principles is to use readily available information, rather than putting a full stop on the project to gather metrics, etc.
It seems to be a very realistic and practical approach. Of course, it is easy to have a security policy that says "Thou Shalt Do A Risk Assessment", but without a data classification, and metrics, and use cases, this is no simple task.
Following the Deming PDCA model, I can see this would be a useful interim target state for companies wishing to go towards a more classical approach, but based on your experience and case studies, there are definite benefits to the approach such that, fitting the organization's objectives, it may indeed serve as a target state. I wonder how many companies that think they are performing classical risk assessments are in reality performing an assessment closer to the FRAP model.