Interview and Book Excerpt: CERT Resilience Management Model
The CERT Resilience Management Model (CERT-RMM), developed by the CERT Program at the Software Engineering Institute (SEI), defines the processes for managing and improving operational resilience in complex, risk-evolving environments. It also converges key operational risk management activities - security, business continuity/disaster recovery (BC/DR), and IT operations - and defines maturity through capability levels, as CMMI does.
CERT-RMM contains 26 process areas in 4 categories: Engineering, Operations Management, Enterprise Management, and Process Management.
InfoQ spoke with Rich Caralli, Technical Manager of the CERT Resilient Enterprise Management Team, about CERT-RMM and the CERT Resilience Management Model book that he co-authored. We are also making an excerpt from the book (Chapter 6: Using CERT-RMM) available to our readers.
InfoQ: What was the main motivation for creating the CERT Resilience Management Model (CERT-RMM)?
Rich Caralli: The model evolved from field work my team did a number of years ago with information security risk assessment methods. As we worked with customers to improve their risk assessment and mitigation capabilities, we observed that they could make temporal, locally optimized progress at the operational unit level but lacked success in having long-term, organization-level impact. We found that much of this could be attributed to the insufficiency of organization-level security processes and risk management activities. In other words, we found little (if any) support for security as an enterprise-wide process, with the result that organizations were unable to sustain and build on localized successes.
Another big detractor we saw was how siloed the domains of security, business continuity, and IT operations management usually were, where they should have a joint mission of supporting operational resilience. So our work on the model focused on providing a tool for converging those domains, connecting them with organizational drivers, and institutionalizing them through organizational processes that can be actively controlled, measured, and continuously improved.
InfoQ: What is the current state of the CERT-RMM project?
Rich: We're applying continuous improvement to the model itself and are currently preparing updates for version 1.2. And regarding the Addison-Wesley book version of the model, we just received a request to have it translated into Portuguese.
We have teams conducting CERT-RMM-based appraisals at various organizations. We've also developed a CERT-RMM-based lightweight assessment instrument called Compass that can be used to quickly identify areas for improvement or set direction for more formal appraisals. Customized derivatives of Compass are being used in some significant assessment initiatives.
There are five public offerings of the Introduction to the CERT Resilience Management Model course scheduled in 2011, one of which will be in London in October. The course is also available by special arrangement at customer sites. Licensing and certification activities necessary to create a CERT-RMM partner network and qualified and certified appraisers and instructors are underway.
Representatives from four large organizations have started meeting quarterly in our CERT-RMM User's Group. Their input will influence future enhancements to the model.
InfoQ: How can the model be used in organizations that are using Agile and Lean methodologies like Scrum, XP or Kanban in software development and operational areas?
Rich: The model is designed for flexible use. Implementation of it can be scoped down to the practice level. Use Compass to quickly identify "pain points," prioritize one or two improvement needs based on the findings, and implement just the practices in the model related to those needs. As a project progresses, another Compass self-assessment can help in evaluating progress and setting new priorities for improvement.
InfoQ: In the book you use the term Key Risk Indicators (KRIs), which provide thresholds for an organization's risk tolerance. Can you discuss KRIs in more detail and give some examples of these metrics?
Rich: Key risk indicators (KRIs) are organizationally specific thresholds that, when crossed, indicate levels of risk that may exceed the organization's risk tolerance. In the model, their use is recommended in governance dashboards or scorecards for measuring and managing the performance of the operational resilience management system. If an indicator shows that a threshold has been crossed, some adjustment in the system or other action to mitigate the risk might be needed.
For example, a KRI of "200 users affected" might be established for virus intrusion incidents. As incidents are monitored, if the virus intrusion KRI is exceeded, it triggers a notification to appropriate staff to act to prevent operational disruption. If KRI data is collected over time, a related measure reported could be the percentage of incidents that caused damage, compromise, or loss to the organization's assets beyond established thresholds.
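As an illustration of the virus intrusion example, the following is a minimal sketch of how a KRI check and the related percentage measure might be computed. It is not taken from CERT-RMM or the book: the `Incident` record, the `KRI_THRESHOLDS` table, the function names, and the sample incident data are all hypothetical, and "exceeded" is assumed to mean strictly greater than the threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    """Hypothetical incident record; field names are assumptions."""
    incident_id: str
    kind: str
    users_affected: int


# Hypothetical KRI table: incident kind -> users-affected threshold.
# The value 200 mirrors the "200 users affected" example in the interview.
KRI_THRESHOLDS = {"virus_intrusion": 200}


def exceeds_kri(incident, thresholds=KRI_THRESHOLDS):
    """Return True if the incident crosses the KRI for its kind.

    Incident kinds without a configured KRI never trigger.
    """
    limit = thresholds.get(incident.kind)
    return limit is not None and incident.users_affected > limit


def percent_beyond_threshold(incidents, thresholds=KRI_THRESHOLDS):
    """Percentage of incidents that exceeded their KRI threshold."""
    if not incidents:
        return 0.0
    over = sum(1 for i in incidents if exceeds_kri(i, thresholds))
    return 100.0 * over / len(incidents)


# Made-up sample data, for illustration only.
incidents = [
    Incident("INC-1", "virus_intrusion", 35),
    Incident("INC-2", "virus_intrusion", 240),
    Incident("INC-3", "virus_intrusion", 200),  # at the threshold, not over it
    Incident("INC-4", "virus_intrusion", 512),
]

# Incidents that would trigger a notification to the appropriate staff.
flagged = [i.incident_id for i in incidents if exceeds_kri(i)]
share = percent_beyond_threshold(incidents)

print("Notify staff about:", flagged)
print(f"Incidents beyond threshold: {share:.1f}%")
```

In practice the thresholds would be set per organization to reflect its own risk tolerance, and the notification step would feed whatever dashboard or scorecard the organization uses.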
InfoQ: What is the future road map of CERT-RMM?
Rich: New CERT-RMM courses are planned, including the CERT-RMM Appraisal Boot Camp, which will provide training in conducting CERT-RMM appraisals. We'll also be developing a course to train instructors for the Introduction to the CERT Resilience Management Model course. To our certification path for CERT-RMM lead appraisers we'll be adding paths for two new roles, CERT-RMM Navigators and Coaches.
CERT-RMM is currently available in the Addison-Wesley book version and in PDFs that can be downloaded from the cert.org website. We plan to automate delivery of the model to enable not just easy navigation but also custom views, links to related resources, tagging, queries, and other features.
As an output of ongoing resilience measurement and analysis research, we'll be publishing an addendum to the model this summer that will suggest "top ten" strategic measures for getting started in operational resilience measurement and will provide new and expanded example measures for all process areas. Future work includes development of a report on using process definitions as a framework for defining measures.
And continuous improvement of the model will continue. We've just started drafting a streamlined model architecture, which will appear along with some other major improvements in version 2.0.
About the Book Author
Richard Caralli is the Technical Manager of the Resilient Enterprise Management Team in the CERT Program at the Carnegie Mellon Software Engineering Institute (SEI). Richard's areas of interest include information assurance risk management, critical infrastructure protection, resilience process improvement, and resilience measurement and analysis. In addition to being the lead architect of the CERT Resilience Management Model, Richard has developed several information assurance risk assessment methods at the SEI, including the OCTAVE Allegro method, and has taught extensively on information security management topics.
Prior to joining the SEI, Richard spent more than 25 years in information technology positions in industry, primarily in IT auditing. Richard received his bachelor’s degree in Accounting from St. Vincent's College and an MBA from Duquesne University.