
Kolton Andrus on Gremlin’s Newly Announced SaaS Chaos Engineering Product and Running Game Days

Podcast with Kolton Andrus by Wesley Reisz on Dec 22, 2017


Gremlin is a Software as a Service offering that lets you plan, control and undo chaos engineering experiments. It is built by engineers with experience from Netflix, AWS, Dropbox and others. In this podcast Wes talks to Kolton Andrus about the Gremlin product and architecture, and related topics such as running Game Days.

Key Takeaways

  • Chaos Engineering is about thoughtful, planned experiments. An example might be a Game Day where we get a group of engineers in a room and whiteboard out what can go wrong, and then run experiments to test our assumptions, such as what happens when your log files fill up the disk.
  • In general the advice is to minimise your blast radius; start with the smallest thing that you can do that will teach you something about your system - a single instance, a single container - and once you’ve got faith in that, scale up and repeat.
  • Gremlin, which is based on the idea of chaos engineering or “resilience as a service”, is now available as a product, with early customers including Twilio, Expedia, Confluent and Remind. There are open source alternatives, but to our knowledge this is the first enterprise product.
  • Examples of the Gremlins you can inject include things that consume resources (for example memory, CPU, disk space, IO overhead), things that change the state of the host or the VM (killing processes, time travel such as a leap second or a clock skew, rebooting hosts and containers), and things that change the network (for example DNS cannot be resolved, or a service is slow).
  • The Gremlin client is written in Rust. Rust does deterministic memory management, and failure is a first-class citizen in Rust: the function signatures have a result. The service is written in Java running on AWS, and the web interface is written in Ember.
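
The advice above about minimising the blast radius can be illustrated with a small sketch. This is not Gremlin's API or implementation; the names `run_staged_experiment`, `StageResult` and `simulated_disk_fill`, the host names, and the stage sizes are all hypothetical and chosen only for illustration. The idea is to run an experiment against one target first, widen only while the system stays healthy, and stop at the first surprise.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class StageResult:
    targets: List[str]  # hosts or containers included in this stage
    healthy: bool       # did the system behave as we hypothesised?


def run_staged_experiment(
    hosts: Sequence[str],
    attack: Callable[[List[str]], bool],
    stages: Sequence[int] = (1, 2, 4),
) -> List[StageResult]:
    """Run `attack` against progressively larger groups of hosts.

    Stops at the first stage where `attack` reports an unhealthy system,
    so any surprise is found at the smallest blast radius tried.
    """
    results: List[StageResult] = []
    for size in stages:
        targets = list(hosts[:size])
        healthy = attack(targets)
        results.append(StageResult(targets, healthy))
        if not healthy:
            break  # halt here: act on what was learned before widening
    return results


def simulated_disk_fill(targets: List[str]) -> bool:
    # Hypothetical stand-in for a real attack: pretend the system copes
    # until more than two hosts have full disks at the same time.
    return len(targets) <= 2


if __name__ == "__main__":
    fleet = ["host-a", "host-b", "host-c", "host-d"]
    for result in run_staged_experiment(fleet, simulated_disk_fill):
        status = "ok" if result.healthy else "halt"
        print(len(result.targets), "target(s):", status)
```

In a real Game Day the `attack` callable would inject the failure and then check dashboards or alerts; here it is simulated so the escalation logic can be read on its own.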

 

Show notes will follow shortly
