50:29

Evolving Culture and Values. Understanding the Tradeoffs. Growth through Failure. The Importance of Leadership and Open Communication.

Posted by Pedram Keyani on Mar 11, 2014

Pedram Keyani discusses the importance of evolving the culture and values of an organization, dealing with tradeoffs, learning from failure, proper leadership and open communication.

30:21

Running an Agile Transformation using Lean Startup

Posted by Jason Little on Feb 01, 2014

Jason Little discusses how to avoid an organizational change failure when introducing Agile by leveraging principles of Lean Startup and Customer Development.

49:18

How Netflix Architects for Survival

Posted by Jeremy Edberg on Nov 29, 2013

Jeremy Edberg discusses how Netflix designs its systems and deployment processes to help the service survive both catastrophic events, such as zone and regional outages, and less catastrophic ones, such as network latency and random instance death.

48:52

Resiliency through Failure - Netflix's Approach to Extreme Availability in the Cloud

Posted by Ariel Tseitlin on Sep 22, 2013

Ariel Tseitlin discusses Netflix's suite of tools, collectively called the Simian Army, used to improve resiliency and maintain the cloud environment. The tools simulate failure in order to see how the system reacts to it.

Keynote: System, Heal Thyself

Posted by Mike Andrews on Oct 03, 2012

Mike Andrews discusses architecting for failure even when you don't know what might fail.

Entirely Predictable Failures

Posted by Poul-Henning Kamp on Sep 26, 2012

Poul-Henning Kamp argues that unless developers get better, we are going to repeat many of the major IT project failures. He illustrates the point with major project failures in Denmark.

Architecting for Failure at the Guardian.co.uk

Posted by Michael Brunton-Spall on Apr 25, 2012

Michael Brunton-Spall talks about various types of system failure that can happen, sharing the lessons learned at the Guardian and measures taken to prevent and mitigate failure.

Resilient Response In Complex Systems

Posted by John Allspaw on Apr 19, 2012

John Allspaw discusses pitfalls to avoid while troubleshooting failed systems, comparing web operations at scale with practices in the aviation and nuclear power industries.

On Distributed Failures (and handling them with Doozer)

Posted by Blake Mizerany on Dec 27, 2011

Blake Mizerany presents various ways in which distributed systems can fail and how to recover using Doozer, a highly available, consistent data store.

Things Break, Riak Bends

Posted by Justin Sheehy on Aug 09, 2011

Justin Sheehy talks about failure and the need to prepare for it, giving some real-life examples along with techniques implemented in Riak to make it resilient to faults.

Everything I've Ever Learned, I Learned from Failure

Posted by Robert Myers on Apr 07, 2011

Robert Myers talks about the role failure plays in Agile development, sharing a number of Lean and Agile practices that help teams embrace failure and showing how to interpret the feedback received.

Failures and Successes with Reuse

Posted by Herbjörn Wilhelmsen on Mar 23, 2011

Herbjörn Wilhelmsen discusses the reasons why an SOA project failed while trying to reuse existing resources, and how it succeeded later starting from the same business case with reuse in mind.

InfoQ.com and all content copyright © 2006-2013 C4Media Inc. InfoQ.com hosted at Contegix, the best ISP we've ever worked with.