Designing Fault Tolerant Distributed Applications

Posted by Scott Andreas  on  Mar 29, 2013

Scott Andreas discusses creating fault-tolerant distributed applications and demos Ordasity, a framework for building self-organizing systems with services.

Runaway Complexity in Big Data, and a Plan to Stop It

Posted by Nathan Marz  on  Oct 25, 2012

Nathan Marz outlines several sources of complexity introduced in data systems (lack of human fault tolerance, conflation of data and queries, schemas done wrong) and what can be done to avoid them.

Erlang's Open Telecom Platform (OTP) Framework

Posted by Steve Vinoski  on  Aug 24, 2012

Steve Vinoski introduces Erlang’s OTP Framework, outlining some of its main features, including several behaviors: implementations of common patterns useful for concurrent fault-tolerant applications.

Storm: Distributed and Fault-tolerant Real-time Computation

Posted by Nathan Marz  on  Jul 25, 2012

Nathan Marz discusses Storm concepts (streams, spouts, bolts, topologies), explaining how to use Storm’s Clojure DSL for real-time stream processing, distributed RPC and continuous computation.

Anomaly Detection, Fault Tolerance and Anticipation Patterns

Posted by John Allspaw  on  May 30, 2012

John Allspaw discusses fault tolerance, anomaly detection and anticipation patterns that help create highly available and resilient systems.

Techniques for Scaling the Netflix API

Posted by Daniel Jacobson  on  Apr 30, 2012

Daniel Jacobson covers the history of Netflix’s APIs, adaptation for the cloud, development and testing, resiliency, and the future of their APIs.

Architecting for Failure at the

Posted by Michael Brunton-Spall  on  Apr 25, 2012

Michael Brunton-Spall talks about various types of system failure that can happen, sharing the lessons learned at the Guardian and measures taken to prevent and mitigate failure.

Building Highly Available Systems in Erlang

Posted by Joe Armstrong  on  Apr 19, 2012

Joe Armstrong discusses highly available (HA) systems, introducing different types of HA systems and data, HA architecture and algorithms, 6 rules of HA, and how HA is done with Erlang.

Storm: Distributed and Fault-tolerant Real-time Computation

Posted by Nathan Marz  on  Oct 21, 2011

Nathan Marz explains Storm, a distributed, fault-tolerant, real-time computation system currently used by Twitter to keep statistics on user clicks for every URL and domain.

Above the Clouds: Introducing Akka

Posted by Jonas Bonér  on  Aug 15, 2011

Jonas Bonér introduces Akka, a JVM platform that aims to address the complex problems of concurrency, scalability and fault tolerance using Actors, STM and self-healing from crashes.

Things Break, Riak Bends

Posted by Justin Sheehy  on  Aug 09, 2011

Justin Sheehy talks about failure and the need to prepare for it, giving some real-life examples along with techniques implemented in Riak to make it resilient to faults.

Message Passing Concurrency in Erlang

Posted by Joe Armstrong  on  May 29, 2010

Joe Armstrong explains through Erlang examples that message-passing concurrency is the foundation of scalable fault-tolerant systems.

