At QCon New York 2015, Kolton Andrus discussed Netflix’s Failure Injection Testing (FIT) platform, which allows the injection and monitoring of arbitrary failure scenarios to a targeted group of customers using the Netflix production web services. FIT allows Netflix to maintain an ‘antifragile’ programming culture, which results in the creation of systems that are resilient to failure.
InfoQ interviewed Andrea Provaglio about business models for execution, optimization and discovery, dealing with uncertainty and leveraging it to create business value, understanding both value and cost, growing a discovery mindset, and creating a culture where people have the courage to make mistakes and can learn from them.
Behavior Driven Development (BDD) uses examples, preferably in conversations, to illustrate behavior. Many people doing BDD focus on the tools, but having the conversations is more important than writing them down and automating them, said Liz Keogh. An exploration of using BDD to run experiments that help deal with complex problems and make discoveries.
Oliver Hankeln shares the anti-patterns he found for handling failure in organizations: hiding mistakes, engaging in the blame game, the arc of escalation, and cowardice. He then suggests corrective actions for each of them.
In the closing keynote of the Agile Eastern Europe 2015 conference, Yves Hanoulle conducted an experiment in pair presenting together with his son Joppe. InfoQ interviewed Joppe and Yves Hanoulle about doing experiments, checking the safety of the environment and ways to make it safer, learning from failure, and presenting in pairs at conferences.
Netflix's Failure Injection Testing bridges the gap between isolated testing and unmitigated chaos testing by controlling the impact of the test. FIT establishes a context that other components of Netflix's production testing and infrastructure systems interpret, adjusting the behavior of the system accordingly.
To thoroughly remove waste in a process, you need flow to deliver just in time, as well as mindfulness and situational awareness in the organization to handle problems with processes using built-in human intelligence. Organizations apply concepts from flow to develop what is needed when it is needed, and use pull to prevent inventories. What they also need is “Jidoka”: mindfulness and situational awareness.
Amazon performed a major maintenance update at the end of September in order to patch a security vulnerability in a Xen hypervisor affecting about 10% of their global fleet of cloud servers. This update involved the rebooting of those servers, with consequences for AWS users and the services they provide, including one of their largest clients, Netflix.
Bob Marshall explains that Scrum Masters fail in most organizations because of a lack of awareness of what adopting Scrum involves and of the Scrum Master’s responsibility to tackle organizational dysfunction.
Leslie Lamport is the author of some of the most cited computer science papers and won a Turing Award in 2013 for his seminal work in distributed and concurrent systems. This is a summary of an interview that Lamport gave to Software Engineering Radio, touching on themes such as his early work in distributed systems and the importance of precise thinking in programming.
Failing fast and often is one of the encouraged practices for agile teams. Sander Hoogendoorn, author of the book This is Agile, discusses on his blog the importance of having a strategy that helps you decide whether to abort a project by assuming its failure at an early stage.
Entrepreneurs using lean startup can work with investors to raise capital for their business. Business plans from lean startups often differ from those of traditional startups, and lean startup encourages learning from failure and pivoting, which might scare off investors. Can entrepreneurs and investors together use the lean startup approach to do fundraising?
Ramli John gave an ignite talk about the minimum viable attitudes for lean startup teams at the 2013 lean startup conference. According to Ramli, there are three attitudes that help teams run lean sustainably over time: humbleness, hunger and happiness.
Agile suggests that teams should fail fast to enable quick learning from mistakes. Learning from failure is one approach; you can also learn early and fast from successes, by experimenting, or by using a plan for knowledge acquisition.
Doug Barth, from PagerDuty, talked at DevOps Days London about the company's approach to starting resiliency testing of its systems without dedicating a lot of automation effort upfront. The goal was to quickly start learning about failure points and openly discuss how to fix them with only one hour per week of effort.