InfoQ

Failure Content on InfoQ


Latest featured content about Failure

Architecting for Failure at the Guardian.co.uk

Topics
QCon London 2012,
QCon,
Operations,
Fault Tolerance,
Conferences,
Infrastructure,
Failure,
Architecture,
Website

Michael Brunton-Spall talks about the types of system failure that can occur, sharing the lessons learned at the Guardian and the measures taken to prevent and mitigate failure.

News about Failure

Adopting Agile in an Environment of Fear

Topics
Adopting Agile,
Agile in the Enterprise,
Success,
Agile,
Failure

Agile adoption and transformation are sometimes effective, and sometimes not. Is there a common thread to the failures? Does fear have anything to do with it? And what can we expect if we start an agile adoption initiative in an environment that is full of fear?

All Right It Failed, What Next?

Topics
Agile Techniques,
Agile,
Failure,
Retrospectives

Usually failures result in anger, frustration and playing the blame game. However, failures are wasted if there is no learning from them. How can Agile teams make failures beautiful?

Commercial Interests Censoring Failures

Topics
Adopting Agile,
Agile in the Enterprise,
Failure,
Agile

Philippe Kruchten said that "The agile movement is in some ways a bit like a teenager: very self-conscious, checking constantly its appearance in a mirror, accepting few criticisms..." and shared a list of twenty elephants in the room, uncomfortable issues that are ignored on purpose. The first of these unmentionables is that commercial interests are censoring failures.

Amazon EC2 Outage Explained and Lessons Learned

Topics
Amazon Web Services,
Amazon,
Operations,
Stories & Case Studies,
IaaS,
Companies,
Agile,
Architecture,
Infrastructure,
Cloud Computing,
Failure,
Amazon RDS

Amazon has published a detailed report on the service failure plaguing one availability zone in the US East Region. The online media is full of analysis, commentary, and lessons to be learned from the event.

Lessons Learned from Skype’s Outage

Topics
Stories & Case Studies,
Automation,
Architecture,
Agile,
Update,
Skype,
Testing,
Failure

On December 22nd at 16:00 GMT, Skype services started to become unavailable, at first for a small share of users, then for more and more, and the network ended up being down for about 24 hours. A week later, Lars Rabbe, CIO at Skype, explained what happened in a post-mortem analysis of the outage.

Presentations about Failure

Resilient Response In Complex Systems

Topics
QCon London 2012,
QCon,
Patterns and Practices,
Operations,
Patterns,
Conferences,
Failure,
Infrastructure

John Allspaw discusses pitfalls to be avoided while troubleshooting failed systems, comparing web operations at scale with practices in the aviation and nuclear power industries.

On Distributed Failures (and handling them with Doozer)

Topics
Strange Loop 2011,
Strange Loop,
Distributed Systems,
Conferences,
Reliability,
Architecture,
Failure

Blake Mizerany presents various ways in which distributed systems can fail and how to recover using Doozer, a highly available, consistent data store.

Things Break, Riak Bends

Topics
Riak,
Distributed Document Oriented Database,
QCon London 2011,
Fault Tolerance,
QCon,
NoSQL,
Distributed Systems,
Architecture,
Infrastructure,
Conferences,
Failure,
Database

Justin Sheehy talks about failure and the need to prepare for it, giving some real-life examples along with techniques implemented in Riak to make it resilient to faults.

Everything I've Ever Learned, I Learned from Failure

Topics
QCon San Francisco 2010,
Agile Techniques,
QCon,
Failure,
Conferences,
Agile,
learning

Robert Myers talks about the role played by failure in Agile development, sharing a number of Lean and Agile practices that help teams embrace failure and showing how to interpret the feedback received.

Failures and Successes with Reuse

Topics
Stories & Case Studies,
SOA,
Architecture,
Agile,
Enterprise Architecture,
SOA + Cloud Symposium,
Success,
Failure,
SOA Symposium

Herbjörn Wilhelmsen discusses the reasons why an SOA project failed while trying to reuse existing resources, and how it later succeeded, starting from the same business case, with reuse in mind.