The InfoQ eMag: Chaos Engineering

As our systems scale, they grow more complex, and that complexity compounds over time. The need to understand and navigate it grows as well. Chaos engineering is a discipline that lets us refine and recalibrate our understanding of our systems through intentional, careful experimentation in the form of failure injection. This greater understanding ultimately leads to a better experience for our customers and better outcomes for our businesses.

At Netflix, we’ve been embracing chaos engineering since Chaos Monkey was born in 2011. Through the efforts of many amazing engineers, it has gone through several iterations and tools, evolving into the Failure Injection Testing (FIT) platform and, ultimately, ChAP (a platform for safely automating and running chaos experiments in production). We’ve outlined why this has been so beneficial for the business in a separate IEEE article titled “The Business Case for Chaos Engineering” and in a free e-book from O’Reilly.

One thing I’ve noticed in my experiences with chaos engineering at various companies is that each approaches it differently, based on its key business objectives, its architectural decisions, and the behaviors and motivations of the people who make up the organization.

I hope that you enjoy the eMag we have created together and that it inspires you to dig deeper into your systems, question your mental models, and use chaos engineering to build confidence in your system’s behaviors under turbulent conditions. Happy reading!

This InfoQ eMag is sponsored by Gremlin. Waiting for a major outage to occur isn’t an option. Run proactive Chaos Experiments to verify that your system can withstand failure—and to fix it if it doesn’t.

The InfoQ eMag: Chaos Engineering includes:

  • LinkedIn’s Waterbear: Influencing resilient applications - Michael Kehoe describes LinkedIn’s per-application resilience engineering effort, called Project Waterbear, and the corresponding suite of tools they built for running chaos-engineering experiments.
  • UIs: Value of the Visual in Chaos Engineering - Patrick Higgins explores the importance of UI for chaos engineering tools, both as a teaching mechanism and as a way to give engineers strong safety mechanisms, for example, the ability to halt experiments that might get out of hand.
  • Using Chaos Engineering to Secure Distributed Systems - Aaron Rinehart explores how chaos engineering can be applied to security testing in distributed systems, arguing that it differs from both red/purple-team security testing and penetration testing in its goals, purpose, and methodology.
  • Recalibrating Mental Models Through Design of Chaos Experiments - John Allspaw explores how designing and running chaos experiments challenges inherent assumptions made by software engineers, providing a mechanism for recalibrating the mental models the engineers have built up about how the system works.

InfoQ eMags are professionally designed, downloadable collections of popular InfoQ content - articles, interviews, presentations, and research - covering the latest software development technologies, trends, and topics.
