Building 'Failure as a Service' at Netflix without the Simian Army

by Daniel Bryant on  Jun 13, 2015

At QCon New York 2015, Kolton Andrus discussed Netflix’s Failure Injection Testing (FIT) platform, which allows the injection and monitoring of arbitrary failure scenarios to a targeted group of customers using the Netflix production web services. FIT allows Netflix to maintain an ‘antifragile’ programming culture, which results in the creation of systems that are resilient to failure.

Uncertainty in Agile and the Discovery Mindset

by Ben Linders on  Jun 11, 2015

InfoQ interviewed Andrea Provaglio about business models for execution, optimization and discovery, dealing with uncertainty and leveraging it to create business value, understanding both value and cost, growing a discovery mindset, and creating a culture where people have the courage to make mistakes and can learn from them.

Experiment using Behavior Driven Development

by Ben Linders on  Apr 16, 2015

Behavior Driven Development (BDD) uses examples, preferably in conversations, to illustrate behavior. Many people doing BDD focus on the tools, but having the conversations is more important than writing them down and automating them, according to Liz Keogh. This is an exploration of using BDD to run experiments, deal with complex problems, and make discoveries.

Anti-patterns for Handling Failure

by Manuel Pais on  Apr 04, 2015

Oliver Hankeln shares the anti-patterns he found for handling failure in organizations: hiding mistakes, engaging in the blame game, the arc of escalation, and cowardice. He then suggests corrective actions for each of them.

Using Pairing for Experimenting in Presentations

by Ben Linders on  Mar 28, 2015

In the closing keynote of the Agile Eastern Europe 2015 conference, Yves Hanoulle ran an experiment in pair presenting together with his son Joppe. InfoQ interviewed Joppe and Yves Hanoulle about doing experiments, checking the safety of the environment and ways to make it safer, learning from failure, and presenting in pairs at conferences.

Failure Injection Testing: Controlling Failure in Production

by Michael Stiefel on  Dec 12, 2014

Netflix's Failure Injection Testing bridges the gap between isolated testing and unmitigated chaos testing by controlling the impact of the test. FIT establishes a context that other components of Netflix's production testing and infrastructure systems interpret, adjusting the behavior of the system accordingly.

Mindfulness and Situational Awareness in Organizations

by Ben Linders on  Nov 12, 2014

To thoroughly remove waste from a process, you need flow to deliver just in time, along with mindfulness and situational awareness in the organization to handle process problems with built-in human intelligence. Organizations apply concepts from flow to develop what is needed when it is needed, and use pull to prevent inventories. What they also need is “Jidoka”: mindfulness and situational awareness.

How Netflix Handled the Reboot of 218 Cassandra Nodes

by Abel Avram on  Oct 28, 2014

Amazon performed a major maintenance update at the end of September to patch a security vulnerability in the Xen hypervisor affecting about 10% of its global fleet of cloud servers. The update involved rebooting those servers, with consequences for AWS users and the services they provide, including one of its largest clients, Netflix.

Avoidance of Organizational Dysfunction Leads to Scrum Masters' Failure

by Savita Pahuja on  Oct 17, 2014

Bob Marshall explains why Scrum Masters fail in most organizations: those adopting Scrum lack awareness of the Scrum Master’s responsibility to tackle organizational dysfunction.

Leslie Lamport on Distributed Systems and Precise Thinking

by Sergio De Simone on  Oct 16, 2014

Leslie Lamport is the author of some of the most cited computer science papers and won the 2013 Turing Award for his seminal work in distributed and concurrent systems. This is a summary of an interview that Lamport gave to Software Engineering Radio, touching on themes such as his early work in distributed systems and the importance of precise thinking in programming.

Fail Fast Means Learn Fast

by Rui Miguel Ferreira on  Jul 04, 2014

Failing fast and often is one of the practices encouraged for agile teams. Sander Hoogendoorn, author of the book This is Agile, discusses on his blog the importance of having a strategy that helps you decide to abort a project by recognizing its failure at an early stage.

Working with Investors as a Lean Startup

by Ben Linders on  Mar 13, 2014

Entrepreneurs using lean startup can work with investors to raise capital for their business. Business plans from lean startups often differ from those of traditional startups, and lean startup encourages learning from failure and pivoting, which might scare off investors. Can entrepreneurs and investors together use the lean startup approach to do fundraising?

Attitudes for Sustainable Lean Startup Teams

by Ben Linders on  Mar 11, 2014

Ramli John gave an Ignite talk about the minimum viable attitudes for lean startup teams at the 2013 Lean Startup Conference. According to Ramli, there are three attitudes that help teams run lean sustainably over time: humbleness, hunger and happiness.

How Can You Learn Early and Fast?

by Ben Linders on  Dec 26, 2013

Agile suggests that teams should fail fast to enable quick learning from mistakes. Learning from failure is one approach; you can also learn early and fast from successes, by doing experimentation, or by using a plan for knowledge acquisition.

Testing Resiliency at PagerDuty Without a Simian Army

by Manuel Pais on  Nov 12, 2013

Doug Barth, from PagerDuty, spoke at DevOps Days London about the company's approach to starting resiliency testing of its systems without dedicating a lot of automation effort upfront. The goal was to quickly start learning about failure points and openly discuss how to fix them with only one hour per week of effort.

InfoQ.com and all content copyright © 2006-2015 C4Media Inc.