At QCon London 2016 Peter Alvaro and Kolton Andrus shared lessons learned from a fruitful collaboration between academia and industry, which ultimately resulted in the creation of a novel method for automating failure injection testing at Netflix. Core learnings included: work backwards from what you know; meet in the middle; and adapt the theory to the reality.
At the microXchg 2016 conference, held in Berlin, Germany, Richard Rodger presented “Surviving Microservices”, a practical guide for developers wanting to keep their microservices architectures ‘healthy and performant’. Key topics discussed in the talk included the benefits of message-oriented systems, pattern matching with inter-service communication, dealing with failure, and Seneca.js.
Failure testing should be a critical part of running your microservices, Kolton Andrus stated in his presentation at the recent Microservices Practitioner Summit. Verifying that your services behave as you expect is something you should do to prevent outages.
InfoQ interviewed Stephen Carver about how bringing in procedures and rules often doesn't help to prevent problems, enabling communication between engineers working in different companies, taking learnings from failure to a next level to prevent similar problems, and what engineers can do if they want to influence decisions on developing and releasing products.
A coachRetreat is a "safe to fail" learning platform where participants can try different approaches to coaching. In a coachRetreat participants explore the way that people interact in a given situation and can learn to view a situation from different perspectives to improve their coaching skills. An interview with Oana Juncu, Elad Sofer and Yves Hanoulle.
Russ Olsen did the opening keynote titled "To the Moon" at the GOTO Berlin 2015 conference. InfoQ interviewed him about drawbacks of doing all the things at the same time to meet the deadline, learning from things that went wrong and from things that went right, how little things can kill you in software development, and how to focus and deal with details when doing complex work.
In innovation, the mantra "fail fast" is often used to explain that people should quickly try out ideas and then learn from the things that fail in order to develop new products and services. Some people have challenged the need for failure and have come up with alternative approaches for effective innovation.
Autonomy is one of the core guiding principles at Spotify. It enables employees to make decisions as close as possible to the work that is being done. At the Agile Greece Summit 2015, Kristian Lindwall and Cliff Hazell from Spotify explained why autonomy is at the heart of agility.
Based on their experience with arbitrarily shutting down servers or simulating the shutdown of an entire data center in production, Netflix has proposed a number of principles of chaos engineering.
Even with the best intentions, it can be challenging for people to follow up on actions that they agreed to do. They can start to doubt whether they can do the actions and become afraid to fail. Several authors have recognized this and come up with suggestions for dealing with it and making change happen.
At QCon New York 2015, Kolton Andrus discussed Netflix’s Failure Injection Testing (FIT) platform, which allows the injection and monitoring of arbitrary failure scenarios to a targeted group of customers using the Netflix production web services. FIT allows Netflix to maintain an ‘antifragile’ programming culture, which results in the creation of systems that are resilient to failure.
InfoQ interviewed Andrea Provaglio about business models for execution, optimization and discovery, dealing with uncertainty and leveraging it to create business value, understanding both value and cost, growing a discovery mindset, and creating a culture where people have the courage to make mistakes and can learn from them.
Behavior Driven Development (BDD) uses examples, preferably in conversations, to illustrate behavior. A lot of people focus on the tools when they are doing BDD, but having the conversations is more important than writing down conversations and automating them, said Liz Keogh. An exploration of using BDD to do experiments to deal with complex problems and make discoveries.
Oliver Hankeln shares the anti-patterns he found for handling failure in organizations: hiding mistakes, engaging in the blame game, the arc of escalation, and cowardice. He then suggests corrective actions for each of them.
In the closing keynote of the Agile Eastern Europe 2015 conference Yves Hanoulle did an experiment together with his son Joppe in pair presenting. InfoQ interviewed Joppe and Yves Hanoulle about doing experiments, checking the safety of the environment and ways to make it safer, learning from failure, and presenting in pairs at conferences.