Fault Tolerance Content on InfoQ

Presentations

Architecting for High Availability

Attila Narin discusses AWS concepts: Availability Zones, RDS Multi-AZ deployments, SQS and Auto Scaling, Elastic IP, load balancing, DNS, DynamoDB, Amazon S3, etc., and EC2 best practices.

Attila Narin
on May 14, 2013

48:02
Designing Fault Tolerant Distributed Applications

Scott Andreas discusses creating fault-tolerant distributed applications and demos Ordasity, a framework for building self-organizing systems with services.

Scott Andreas
on Mar 29, 2013

48:07
Runaway Complexity in Big Data, and a Plan to Stop It

Nathan Marz outlines several sources of complexity introduced in data systems - Lack of human fault-tolerance, Conflation of data and queries, Schemas done wrong - and what can be done to avoid them.

Nathan Marz
on Oct 25, 2012

48:54
Erlang's Open Telecom Platform (OTP) Framework

Steve Vinoski introduces Erlang’s OTP Framework, outlining some of its main features, including several behaviors – implementations of common patterns useful for concurrent fault-tolerant applications.

Steve Vinoski
on Aug 24, 2012

01:05:45
Storm: Distributed and Fault-tolerant Real-time Computation

Nathan Marz discusses Storm concepts (streams, spouts, bolts, topologies), explaining how to use Storm’s Clojure DSL for real-time stream processing, distributed RPC, and continuous computations.

Nathan Marz
on Jul 25, 2012

42:26
Anomaly Detection, Fault Tolerance and Anticipation Patterns

John Allspaw discusses fault tolerance, anomaly detection, and anticipation patterns that help create highly available and resilient systems.

John Allspaw
on May 30, 2012

53:34
Techniques for Scaling the Netflix API

Daniel Jacobson covers the history of Netflix’s APIs, adaptation for the cloud, development and testing, resiliency, and the future of their APIs.

Daniel Jacobson
on Apr 30, 2012

57:31
Architecting for Failure at the Guardian.co.uk

Michael Brunton-Spall talks about various types of system failure that can happen, sharing the lessons learned at the Guardian and measures taken to prevent and mitigate failure.

Michael Brunton-Spall
on Apr 25, 2012

01:00:57
Building Highly Available Systems in Erlang

Joe Armstrong discusses highly available (HA) systems, introducing different types of HA systems and data, HA architecture and algorithms, 6 rules of HA, and how HA is done with Erlang.

Joe Armstrong
on Apr 19, 2012

59:30
Storm: Distributed and Fault-tolerant Real-time Computation

Nathan Marz explains Storm, a distributed, fault-tolerant, real-time computation system currently used by Twitter to keep statistics on user clicks for every URL and domain.

Nathan Marz
on Oct 21, 2011

49:37
Above the Clouds: Introducing Akka

Jonas Bonér introduces Akka, a JVM platform that aims to address the complex problems of concurrency, scalability and fault tolerance using Actors, STM and self-healing from crashes.

Jonas Bonér
on Aug 15, 2011

01:01:18
Things Break, Riak Bends

Justin Sheehy talks about failure and the need to prepare for it, giving some real-life examples along with techniques implemented in Riak to make it resilient to faults.

Justin Sheehy
on Aug 09, 2011

58:47
