Bloomberg Releases Open Source “PowerfulSeal” Kubernetes-Specific Chaos Testing Tool

At the recent KubeCon North America conference, in Austin, USA, Bloomberg presented their new open source "PowerfulSeal" tool, which enables chaos testing within Kubernetes clusters via the termination of targeted pods and underlying node infrastructure. The Kubernetes container orchestration platform is a popular choice for deploying (distributed) microservice-based applications, and practices from chaos engineering can assist with building resilient systems.

PowerfulSeal follows the Principles of Chaos Engineering, and is inspired by the infamous Netflix Chaos Monkey. The tool allows engineers to "break things on purpose" and observe any issues caused by the introduction of various failure modes. PowerfulSeal is written in Python and is currently Kubernetes-specific. Its only "cloud driver" for managing infrastructure failure targets the OpenStack platform, although a Python AbstractDriver class has been specified in order to encourage the contribution of drivers for additional cloud platforms.
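To illustrate the idea of a pluggable cloud driver, here is a minimal sketch of an abstract base class with one toy implementation. The class names `CloudDriver` and `InMemoryDriver` and the `stop`/`start` methods are hypothetical and chosen for illustration only; they are not taken from the PowerfulSeal source, whose AbstractDriver may define a different interface.

```python
from abc import ABC, abstractmethod


class CloudDriver(ABC):
    """Hypothetical driver interface; method names are illustrative only."""

    @abstractmethod
    def stop(self, node_name: str) -> None:
        """Power off the machine backing the given node."""

    @abstractmethod
    def start(self, node_name: str) -> None:
        """Power the machine backing the given node back on."""


class InMemoryDriver(CloudDriver):
    """Toy driver that tracks node state in a dict instead of calling a cloud API."""

    def __init__(self, node_names):
        self.state = {name: "UP" for name in node_names}

    def stop(self, node_name: str) -> None:
        self._require(node_name)
        self.state[node_name] = "DOWN"

    def start(self, node_name: str) -> None:
        self._require(node_name)
        self.state[node_name] = "UP"

    def _require(self, node_name: str) -> None:
        # Fail loudly rather than silently creating state for unknown nodes.
        if node_name not in self.state:
            raise KeyError(f"unknown node: {node_name}")
```

A driver for another cloud platform would, under this sketch, subclass the abstract class and replace the dict updates with calls to that platform's API.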

PowerfulSeal works in two modes -- interactive and autonomous:

  • Interactive mode is designed to allow an engineer to discover a cluster's components, and manually cause failure to see what happens. It operates on nodes, pods, deployments and namespaces.
  • Autonomous mode reads a policy file, which can contain any number of pod and node failure scenarios, and "breaks things" as specified. Each scenario describes a list of matches, filters and actions to execute on your cluster.

A minimal no-op policy file in YAML (which causes no failures) is shown below -- failures can be specified within the 'nodeScenarios' and 'podScenarios' sections of the document:

config:
  minSecondsBetweenRuns: 47
  maxSecondsBetweenRuns: 452

nodeScenarios: []
podScenarios: []
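As a rough sketch of how such a policy might be consumed, the snippet below holds the same values in a Python dict and picks a wait time between runs. The helper names `next_wait_seconds` and `scenario_count` are hypothetical, and the uniform random choice between the two bounds is an assumption about how the `config` values are used, not a description of PowerfulSeal's implementation.

```python
import random

# The no-op policy shown above, written as the dict a YAML parser would produce.
policy = {
    "config": {"minSecondsBetweenRuns": 47, "maxSecondsBetweenRuns": 452},
    "nodeScenarios": [],
    "podScenarios": [],
}


def next_wait_seconds(policy, rng=random):
    """Pick a wait between runs (assumption: uniform integer in [min, max])."""
    config = policy["config"]
    low = config["minSecondsBetweenRuns"]
    high = config["maxSecondsBetweenRuns"]
    if low > high:
        raise ValueError("minSecondsBetweenRuns must not exceed maxSecondsBetweenRuns")
    return rng.randint(low, high)


def scenario_count(policy):
    """Total number of node and pod scenarios; zero means a no-op policy."""
    return len(policy["nodeScenarios"]) + len(policy["podScenarios"])
```

With both scenario lists empty, `scenario_count(policy)` is zero, which is what makes this policy a no-op.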

Each scenario can consist of matches and filters (target node names, IP addresses, Kubernetes namespaces and labels, times and dates) and actions (start, stop, and kill). A comprehensive JSON schema can be used to validate the policy files, and an example policy file listing most of the available options can be found within the project's tests.
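The match, filter and action stages of a scenario can be pictured as a small pipeline. The sketch below models this over in-memory pod records; the `Pod` class and the functions `match_namespace`, `filter_by_label`, `kill` and `run_scenario` are hypothetical names for illustration and do not reflect PowerfulSeal's actual code or policy syntax.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    """Hypothetical stand-in for a pod as seen by a chaos tool."""
    name: str
    namespace: str
    labels: dict = field(default_factory=dict)
    state: str = "Running"


def match_namespace(pods, namespace):
    """Match stage: select the initial set of pods by namespace."""
    return [p for p in pods if p.namespace == namespace]


def filter_by_label(pods, key, value):
    """Filter stage: narrow the matched set to pods carrying a given label."""
    return [p for p in pods if p.labels.get(key) == value]


def kill(pods):
    """Action stage: mark each target as killed and return the affected names."""
    for p in pods:
        p.state = "Killed"
    return [p.name for p in pods]


def run_scenario(pods, namespace, label_key, label_value):
    """Run match, then filter, then action, in that order."""
    matched = match_namespace(pods, namespace)
    targets = filter_by_label(matched, label_key, label_value)
    return kill(targets)
```

For example, given pods in a "prod" and a "staging" namespace, `run_scenario(pods, "prod", "app", "web")` would affect only the pods in "prod" labelled `app=web` and leave the rest running.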

PowerfulSeal can be installed via pip, and the command line tool is initialised and configured against a Kubernetes cluster as follows:

  • Point PowerfulSeal at the target Kubernetes cluster by giving it a Kubernetes config file
  • Point PowerfulSeal at the underlying cloud IaaS platform by specifying the appropriate cloud driver and credentials
  • Ensure that PowerfulSeal can SSH into the nodes in order to execute commands
  • Write the required policy files and load these into PowerfulSeal

The topics of chaos and resilience engineering have received increased interest over the past year, and the first commercial tools in this space are emerging, for example, Gremlin. Several thought leaders within this space -- such as John Allspaw, co-founder at Adaptive Capacity Labs -- are cautioning that the human side of resilience engineering should not be forgotten, and is in fact more important than the associated tooling.

Kolton Andrus, CEO of Gremlin Inc, also stated that tooling alone is not sufficient, and argued for the need to train engineering teams and run "game days" in order to drill engineers in how to respond to failure (Andrus provided more context in a recent InfoQ podcast). Nora Jones, senior chaos engineer at Netflix, has also shared her thoughts on how to establish and mature chaos engineering practice in a recent InfoQ podcast.

More information and an interactive demo of PowerfulSeal can be found in the project's GitHub README, and the video of the "Testing Distributed Software on Kubernetes with PowerfulSeal" KubeCon talk can be found on the CNCF YouTube channel.
