Resilience Content on InfoQ
-
Designing & Managing for Resilience
The fourth article in a series on how software companies adapted and continue to adapt to enhance their resilience explores the strategies used by engineering leaders to help create the conditions for sustained resilience. It provides stories, examples, and strategies for designing an organizational structure that supports resilient performance and for managing for resilience.
-
Adaptive Frontline Incident Response: Human-Centered Incident Management
The third article in a series on how software companies adapted and continue to adapt to enhance their resilience zeros in on the sources that comprise most of your company’s adaptive resources: your frontline responders. In this article, we draw on our experiences as incident commanders with Twilio to share our reflections on what it means to cultivate resilient people.
-
Learning from Incidents
Jessica DeVita (Netflix) and Nick Stenning (Microsoft) have been working on improving how software teams learn from incidents in production. In this article, they share some of what they’ve learned from the research community in this area, and offer some advice on the practical application of this work.
-
Meeting the Challenges of Disrupted Operations: Sustained Adaptability for Organizational Resilience
The first article in a series on how software companies adapted and continue to adapt to enhance their resilience starts by laying a foundation for thinking about organizational resilience. It looks at what organizations can do structurally during surprising and disruptive events to establish conditions that help engineering teams adapt in practice and in real time as disruptive events occur.
-
InfoQ 2020 Recap, Editor Recommendations, and Best Content of the Year
As 2020 is coming to an end, we created this article listing some of the best posts published this year. This collection was hand-picked by nine InfoQ Editors recommending the greatest posts in their domain. It's a great piece to make sure you don't miss out on some of InfoQ's best content.
-
SeaMonkeys - Chaos in the War Room
Glen Ford describes his experience applying a very early form of chaos testing to naval combat systems in the Australian military in the late 1990s and draws parallels to modern SRE.
-
The Abyss of Ignorable: a Route into Chaos Testing from Starling Bank
Greg Hawkins describes how Starling Bank introduced a chaos engineering practice, starting in 2016 with their own simple chaos daemon.
-
Applying Chaos Engineering in Healthcare: Getting Started with Sensitive Workloads
Carl Chesser shares what the teams at Cerner Corporation, a healthcare information technology company, found to be effective in introducing chaos engineering with their systems.
-
Failover Conf Q&A on Building Reliable Systems: People, Process, and Practice
One of the biggest engineering challenges associated with maintaining or increasing the reliability of a system is knowing where to invest time and energy. InfoQ recently sat down with several engineers and technical leaders who are involved with the upcoming Failover Conf virtual event, and asked their opinion on the best practices for building and running reliable systems.
-
The Fundamental Truth behind Successful Development Practices: Software is Synthetic
Software systems are creative compounds, emergent and generative; the product of complex interactions between people and technology. They are different from the orderly, analytic worlds that our school-age selves expect to find. Because they are so full of complexity and uncertainty, we use a different way to arrive at a solution.
-
InfoQ Editors' Recommended Talks from 2019
As part of the 2019 end-of-year summary content, this article collects a list of recommended presentation recordings from the InfoQ editorial team.
-
SLOs Are the API for Your Engineering Team
SLOs provide a simple common language for evaluating risk in terms of error budgets. SLOs save everyone involved both time and energy, which you can redirect toward more important things, like keeping your customers happy.