
Failure: The Good Parts

Posted by Viktor Klang  on  May 01, 2014

Viktor Klang keynotes on the imminence of failure and the need to prepare for it, along with several ways of managing failure when it happens.


Evolving Culture and Values. Understanding the Tradeoffs. Growth through Failure. The Importance of Leadership and Open Communication.

Posted by Pedram Keyani  on  Mar 11, 2014

Pedram Keyani discusses the importance of evolving the culture and values of an organization, dealing with tradeoffs, learning from failure, proper leadership and open communication.


Running an Agile Transformation using Lean Startup

Posted by Jason Little  on  Feb 01, 2014

Jason Little discusses how to avoid an organizational change failure when introducing Agile by leveraging principles of Lean Startup and Customer Development.


How Netflix Architects for Survival

Posted by Jeremy Edberg  on  Nov 29, 2013

Jeremy Edberg discusses how Netflix designs their systems in order to survive outages, network latency and random instance failure.


Resiliency through Failure - Netflix's Approach to Extreme Availability in the Cloud

Posted by Ariel Tseitlin  on  Sep 22, 2013

Ariel Tseitlin discusses Netflix's failure-based suite of tools, collectively called the Simian Army, used to improve resiliency and maintain the cloud environment.

Keynote: System, Heal Thyself

Posted by Mike Andrews  on  Oct 03, 2012

Mike Andrews discusses architecting for failure even when you don’t know what might fail.

Entirely Predictable Failures

Posted by Poul-Henning Kamp  on  Sep 26, 2012

Poul-Henning Kamp argues that unless developers get better, we are going to repeat many of the major IT project failures. He illustrates the point with major project failures in Denmark.

Architecting for Failure at the

Posted by Michael Brunton-Spall  on  Apr 25, 2012

Michael Brunton-Spall talks about various types of system failure that can happen, sharing the lessons learned at the Guardian and measures taken to prevent and mitigate failure.

Resilient Response In Complex Systems

Posted by John Allspaw  on  Apr 19, 2012

John Allspaw discusses pitfalls to be avoided while troubleshooting failed systems, comparing web operations at scale with practices in aviation and nuclear power industries.

On Distributed Failures (and handling them with Doozer)

Posted by Blake Mizerany  on  Dec 27, 2011

Blake Mizerany presents various ways in which distributed systems can fail and how to recover using Doozer, a highly available, consistent data store.

Things Break, Riak Bends

Posted by Justin Sheehy  on  Aug 09, 2011

Justin Sheehy talks about failure and the need to prepare for it, giving some real life examples along with techniques implemented in Riak to make it resilient to faults.

Everything I've Ever Learned, I Learned from Failure

Posted by Robert Myers  on  Apr 07, 2011

Robert Myers talks about the role failure plays in Agile development, sharing a number of Lean and Agile practices that help teams embrace failure and showing how to interpret the feedback received.

Marketing and all content copyright © 2006-2015 C4Media Inc. hosted at Contegix, the best ISP we've ever worked with.