Chaos Engineering Content on InfoQ
-
Werner Vogels on “21st Century [Cloud] Architectures”: Availability, Reliability and Resilience
At the AWS re:Invent 2017 conference, Werner Vogels, CTO of Amazon, presented a keynote on the core concepts required for building “21st Century Architectures” in the cloud. Highlights of the talk included the emerging practices of evolutionary and “cloud native” architectures, security becoming everyone’s responsibility, and the benefits of chaos engineering.
-
Expedia's Journey toward Site Resiliency: Embracing Chaos Testing in Dev and Production at QCon SF
At QCon SF, Sahar Samiei and Willie Wheeler presented “Expedia’s Journey Toward Site Resiliency” and discussed building a community of practice around resilience testing within Expedia. The results have generally been positive: Netflix’s Chaos Monkey has been running daily in production since May 15th, and resilience tests have been added to four Tier 1 service pipelines.
-
Adrian Cockcroft Discusses Chaos Architecture: "Four Layers, Two Teams, and an Attitude"
At QCon San Francisco, Adrian Cockcroft presented “Chaos Architecture” and discussed the evolution of cloud native architecture and how chaos engineering can be applied to produce better and safer systems. He described effective chaos architecture and engineering as consisting of “four layers, two teams, and an attitude”.
-
Designing Services for Resilience: Nora Jones Discusses Netflix Chaos Engineering at QCon SF
At QCon SF, Nora Jones presented “Designing Services for Resilience Experiments: Lessons from Netflix”. Key takeaways from the talk included: the customer experience is a priority; designing for resiliency testability is a shared responsibility; configuration changes can cause outages; and engineers should have explicit monitoring in place to detect antipatterns in configuration changes.
-
Choose Your Own Adventure: Chaos Engineering at QCon New York 2017
Nora Jones, senior chaos engineer at Netflix, talked about chaos engineering at QCon New York 2017. She presented the different stages of chaos engineering adoption and shared stories from her previous experiences at Jet and Netflix.
-
Netflix Engineer Lorin Hochstein on Chaos Monkey 2.0
Netflix made waves when it initially announced Chaos Monkey, a tool that would terminate normally healthy VM instances in production. The goal was to embrace failure and thereby increase resiliency. Rags Srinivas caught up with Lorin Hochstein at Netflix regarding the recent upgrade to Chaos Monkey.
-
Chaos Monkey 2.0 Runs via Spinnaker
Netflix has recently made the source code of Chaos Monkey 2.0 available. The latest iteration of the resilience tool is fully integrated with Spinnaker and event tracking systems, but SSH support has been removed.