Chaos Monkey 2.0 Runs via Spinnaker
Netflix has recently released the source code of Chaos Monkey 2.0. The latest iteration of the resilience tool is fully integrated with Spinnaker and with event tracking systems, but SSH support has been removed.
Chaos Monkey 2.0 is now configured and run through Spinnaker, a continuous delivery platform open sourced by Netflix. This integration makes it possible to access instances in AWS, Google Cloud, Microsoft Azure and Cloud Foundry. Spinnaker provides information on how services are deployed across datacenters, and Chaos Monkey uses that information to terminate instances as scheduled.
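To make the idea concrete, here is a minimal Go sketch of how a tool might pick one instance to terminate from deployment information. It is an illustration only, not Chaos Monkey's code or Spinnaker's API: the `Instance`, `Deployment` and `Terminator` types, the `PickVictim` function and the in-memory demo data are all hypothetical names introduced for this example.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// Instance is a hypothetical, minimal description of a running instance.
type Instance struct {
	ID    string
	Group string // name of the group of instances it belongs to
}

// Deployment is a hypothetical stand-in for the deployment information
// a platform such as Spinnaker could provide. It is not Spinnaker's API.
type Deployment interface {
	Instances(group string) []Instance
}

// Terminator is a hypothetical stand-in for whatever performs the termination.
type Terminator interface {
	Terminate(inst Instance) error
}

// ErrNoInstances is returned when a group has nothing to terminate.
var ErrNoInstances = errors.New("no instances in group")

// PickVictim selects one instance from the group. The pick function receives
// the number of candidates n and must return an index in [0, n).
func PickVictim(d Deployment, group string, pick func(n int) int) (Instance, error) {
	candidates := d.Instances(group)
	if len(candidates) == 0 {
		return Instance{}, ErrNoInstances
	}
	return candidates[pick(len(candidates))], nil
}

// staticDeployment is an in-memory Deployment used only for the demo.
type staticDeployment map[string][]Instance

func (s staticDeployment) Instances(group string) []Instance { return s[group] }

// dryRunTerminator records IDs instead of terminating anything.
type dryRunTerminator struct{ terminated []string }

func (t *dryRunTerminator) Terminate(inst Instance) error {
	t.terminated = append(t.terminated, inst.ID)
	return nil
}

func main() {
	d := staticDeployment{"api": {
		{ID: "i-1", Group: "api"},
		{ID: "i-2", Group: "api"},
		{ID: "i-3", Group: "api"},
	}}
	term := &dryRunTerminator{}

	// rand.Intn picks uniformly at random among the candidates.
	victim, err := PickVictim(d, "api", rand.Intn)
	if err != nil {
		fmt.Println("nothing to do:", err)
		return
	}
	if err := term.Terminate(victim); err != nil {
		fmt.Println("termination failed:", err)
		return
	}
	fmt.Println("would terminate:", term.terminated)
}
```

Passing the selection function as a parameter keeps the random choice separate from the lookup, so the selection can be made deterministic in tests. When and how often such a pick runs is left out of the sketch.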
Netflix configures Chaos Monkey to report instance terminations to Atlas and Chronos, enabling the company to track and visualize how often instances are terminated. The tool can also be configured to work with other telemetry and event tracking systems.
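One way to picture pluggable reporting is a small interface that each tracking backend implements. The Go sketch below is an assumption made for illustration, not the tool's actual API: `Termination`, `Tracker`, `ReportAll` and `countingTracker` are hypothetical names, and no real Atlas or Chronos client is involved.

```go
package main

import (
	"fmt"
	"time"
)

// Termination is a hypothetical record of one instance termination.
type Termination struct {
	InstanceID string
	Time       time.Time
}

// Tracker is a hypothetical interface that a telemetry or event
// tracking backend would implement to receive termination reports.
type Tracker interface {
	Track(t Termination) error
}

// ReportAll sends the termination to every tracker and collects the errors,
// so that one failing backend does not prevent reports to the others.
func ReportAll(trackers []Tracker, t Termination) []error {
	var errs []error
	for _, tr := range trackers {
		if err := tr.Track(t); err != nil {
			errs = append(errs, err)
		}
	}
	return errs
}

// countingTracker is a demo backend that only counts the reports it receives.
type countingTracker struct {
	name  string
	count int
}

func (c *countingTracker) Track(t Termination) error {
	c.count++
	return nil
}

func main() {
	metrics := &countingTracker{name: "metrics"}
	events := &countingTracker{name: "events"}

	errs := ReportAll(
		[]Tracker{metrics, events},
		Termination{InstanceID: "i-1", Time: time.Now()},
	)
	fmt.Println("failed reports:", len(errs))
	fmt.Println(metrics.name, metrics.count, events.name, events.count)
}
```

With an interface like this, supporting another telemetry system would mean adding one more `Tracker` implementation rather than changing the termination logic.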
The SSH capability, which allowed one to connect to an instance and tweak its CPU consumption or take a disk down, has been removed from the tool. These failure modes were considered too “insidious” to be applied at random, and a different approach was developed for them.
Chaos Monkey is the little brother of Chaos Gorilla and Chaos Kong, a pair of resilience tools used by Netflix to simulate the failure of AWS availability zones or entire regions. Chaos Monkey, by contrast, terminates individual instances in a datacenter or across multiple regions. Netflix has used the tool to push its software engineers to design and implement systems with resilience in mind.
Chaos Monkey needs Spinnaker and MySQL to run. It is written in Go, but it does not run as a service. Instead it is triggered by a