Bloomberg Releases Open Source “PowerfulSeal” Kubernetes-Specific Chaos Testing Tool

by Daniel Bryant on Jan 25, 2018. Estimated reading time: 2 minutes

At the recent KubeCon North America conference, in Austin, USA, Bloomberg presented their new open source "PowerfulSeal" tool, which enables chaos testing within Kubernetes clusters via the termination of targeted pods and underlying node infrastructure. The Kubernetes container orchestration platform is a popular choice for deploying (distributed) microservice-based applications, and practices from chaos engineering can assist with building resilient systems.

PowerfulSeal follows the Principles of Chaos Engineering, and is inspired by the infamous Netflix Chaos Monkey. The tool allows engineers to "break things on purpose" and observe any issues caused by the introduction of various failure modes. PowerfulSeal, written in Python, is currently Kubernetes-specific and only has "cloud drivers" for managing infrastructure failure for the OpenStack platform, although a Python AbstractDriver class has been specified in order to encourage the contribution of drivers for additional cloud platforms.
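To make the driver idea concrete, the sketch below shows what a pluggable cloud-driver interface could look like in Python. It is an illustration only: the class name `CloudDriver`, its `start`, `stop` and `state` methods, and the in-memory `FakeDriver` are hypothetical names chosen for this example, and they do not reproduce the actual signature of PowerfulSeal's AbstractDriver.

```python
from abc import ABC, abstractmethod


class CloudDriver(ABC):
    """Hypothetical driver interface; not PowerfulSeal's real AbstractDriver."""

    @abstractmethod
    def stop(self, node_name: str) -> None:
        """Power off the machine backing the given node."""

    @abstractmethod
    def start(self, node_name: str) -> None:
        """Power the machine backing the given node back on."""

    @abstractmethod
    def state(self, node_name: str) -> str:
        """Return the current state of the machine, e.g. "UP" or "DOWN"."""


class FakeDriver(CloudDriver):
    """In-memory stand-in for a real cloud API, useful for dry runs."""

    def __init__(self, node_names):
        self._states = {name: "UP" for name in node_names}

    def stop(self, node_name):
        self._states[node_name] = "DOWN"

    def start(self, node_name):
        self._states[node_name] = "UP"

    def state(self, node_name):
        return self._states[node_name]


driver = FakeDriver(["node-1", "node-2"])
driver.stop("node-1")
print(driver.state("node-1"), driver.state("node-2"))
```

A driver for another cloud platform would, under this assumed design, subclass the abstract base and translate each method into calls against that platform's API.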

PowerfulSeal works in two modes -- interactive and autonomous:

  • Interactive mode is designed to allow an engineer to discover a cluster's components, and manually cause failure to see what happens. It operates on nodes, pods, deployments and namespaces.
  • Autonomous mode reads a policy file, which can contain any number of pod and node failure scenarios, and "breaks things" as specified. Each scenario describes a list of matches, filters and actions to execute on your cluster.

A minimal no-op YAML policy file (which causes no failures) is shown below -- failures can be specified within the 'nodeScenarios' and 'podScenarios' sections of the document:

config:
  minSecondsBetweenRuns: 47
  maxSecondsBetweenRuns: 452

nodeScenarios: []
podScenarios: []
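As a rough illustration of how an autonomous loop might consume such a policy, the Python sketch below represents the no-op policy as a dictionary and pauses for a random interval between the two configured bounds after each run. The function names `next_wait_seconds` and `run_autonomous`, and the assumption that the pause is drawn uniformly between the minimum and maximum, are illustrative guesses rather than a description of PowerfulSeal's internals.

```python
import random

# The no-op policy, as it might look once parsed into Python objects.
policy = {
    "config": {"minSecondsBetweenRuns": 47, "maxSecondsBetweenRuns": 452},
    "nodeScenarios": [],
    "podScenarios": [],
}


def next_wait_seconds(policy, rng=random):
    """Pick a random pause between the configured bounds (assumed semantics)."""
    config = policy["config"]
    low = config["minSecondsBetweenRuns"]
    high = config["maxSecondsBetweenRuns"]
    if low > high:
        raise ValueError("minSecondsBetweenRuns must not exceed maxSecondsBetweenRuns")
    return rng.uniform(low, high)


def run_autonomous(policy, runs, execute, sleep):
    """Execute every scenario `runs` times, pausing after each pass."""
    executed = 0
    for _ in range(runs):
        for scenario in policy["nodeScenarios"] + policy["podScenarios"]:
            execute(scenario)
            executed += 1
        sleep(next_wait_seconds(policy))
    return executed


# Record the pauses instead of really sleeping, so the demo finishes instantly.
pauses = []
count = run_autonomous(policy, runs=3, execute=print, sleep=pauses.append)
print(count, len(pauses))  # prints "0 3": nothing executed, three pauses taken
```

Passing `sleep` and `execute` in as parameters keeps the loop testable; a real run would presumably pass `time.sleep` and a function that applies the scenario to the cluster.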

Each scenario can consist of matches and filters (target node names, IP addresses, Kubernetes namespaces and labels, times and dates) and actions (start, stop, and kill). A comprehensive JSON schema can be used to validate the policy files, and an example policy file listing most of the available options can be found within the project's tests.
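The match, filter and act flow can be pictured as a small pipeline. The sketch below is a simplified Python model of that idea, not PowerfulSeal code: the `Pod` dataclass, the helper functions and the shape of the `scenario` dictionary are all hypothetical, and the "kill" action only flips a field on an in-memory object.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    """Hypothetical, minimal view of a pod for this illustration."""
    name: str
    namespace: str
    labels: dict = field(default_factory=dict)
    state: str = "Running"


def match_namespace(pods, namespace):
    """Match step: start from every pod in the given namespace."""
    return [p for p in pods if p.namespace == namespace]


def filter_by_label(pods, key, value):
    """Filter step: keep only pods carrying the label key=value."""
    return [p for p in pods if p.labels.get(key) == value]


def kill(pods):
    """Action step: mark each selected pod as killed and report its name."""
    for pod in pods:
        pod.state = "Killed"
    return [pod.name for pod in pods]


def run_scenario(pods, scenario):
    """Apply match, then each label filter, then the kill action."""
    selected = match_namespace(pods, scenario["namespace"])
    for key, value in scenario.get("labels", {}).items():
        selected = filter_by_label(selected, key, value)
    return kill(selected)


pods = [
    Pod("web-1", "shop", {"app": "web"}),
    Pod("db-1", "shop", {"app": "db"}),
    Pod("web-2", "blog", {"app": "web"}),
]
killed = run_scenario(pods, {"namespace": "shop", "labels": {"app": "web"}})
print(killed)  # ['web-1']
```

Only `web-1` satisfies both the namespace match and the label filter, so it is the only pod the action touches.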

PowerfulSeal can be installed via pip, and the command line tool is initialised and configured against a Kubernetes cluster as follows:

  • Point PowerfulSeal at the target Kubernetes cluster by giving it a Kubernetes config file
  • Point PowerfulSeal at the underlying cloud IaaS platform by specifying the appropriate cloud driver and credentials
  • Ensure that PowerfulSeal can SSH into the nodes in order to execute commands
  • Write the required policy files and load these into PowerfulSeal
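Before a first run, it can help to check that the inputs listed above are actually in place. The following Python sketch is a hypothetical pre-flight check, not part of PowerfulSeal: the `SealSetup` dataclass, the `missing_prerequisites` function and the example paths are invented for illustration, and "openstack" is listed as the only known driver simply because it is the only one the article mentions.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SealSetup:
    """Hypothetical bundle of the inputs listed above; not a PowerfulSeal type."""
    kubeconfig: Path
    cloud_driver: str
    ssh_private_key: Path
    policy_file: Path


def missing_prerequisites(setup, known_drivers=("openstack",)):
    """Return a human-readable list of problems; empty means ready to go."""
    problems = []
    required_files = (
        ("kubeconfig", setup.kubeconfig),
        ("SSH key", setup.ssh_private_key),
        ("policy file", setup.policy_file),
    )
    for label, path in required_files:
        if not Path(path).is_file():
            problems.append(f"{label} not found: {path}")
    if setup.cloud_driver not in known_drivers:
        problems.append(f"unknown cloud driver: {setup.cloud_driver}")
    return problems


# Example paths are placeholders and are expected not to exist.
setup = SealSetup(
    kubeconfig=Path("/nonexistent-demo/kubeconfig"),
    cloud_driver="openstack",
    ssh_private_key=Path("/nonexistent-demo/id_rsa"),
    policy_file=Path("/nonexistent-demo/policy.yml"),
)
for problem in missing_prerequisites(setup):
    print(problem)
```

This only checks that the files exist; whether the credentials work and the nodes accept SSH connections still has to be verified against the real cluster.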

The topics of chaos and resilience engineering have received increased interest over the past year, and the first commercial tools in this space, such as Gremlin, are emerging. Several thought leaders within this space -- such as John Allspaw, co-founder at Adaptive Capacity Labs -- are cautioning that the human side of resilience engineering should not be forgotten, and is in fact more important than the associated tooling.

Kolton Andrus, CEO of Gremlin Inc, also stated that tooling alone is not sufficient, and argued for the need to train engineering teams and run "game days" in order to drill engineers in how to respond to failure (Andrus provided more context in a recent InfoQ podcast). Nora Jones, senior chaos engineer at Netflix, has also shared her thoughts on how to establish and mature chaos engineering practice in a recent InfoQ podcast.

More information and an interactive demo of PowerfulSeal can be found in the project's GitHub README, and the video of the "Testing Distributed Software on Kubernetes with PowerfulSeal" KubeCon talk can be found on the CNCF YouTube channel.
