Fault Tolerance Content on InfoQ

Presentations

Architecture & Design

Monkeys in Lab Coats: Applying Failure Testing Research @Netflix

The authors present how lineage-driven fault injection evolved from a theoretical model into an automated failure testing system that leverages Netflix’s fault injection and tracing infrastructures.

Kolton Andrus Peter Alvaro
on Mar 24, 2016

43:44
Architecture & Design

Scaling Distributed Systems

Natalia Chechina outlines features of the actor and functional programming models and explains why these models attract so much interest in parallel, concurrent, and scalable computing.

Natalia Chechina
on Oct 04, 2015

32:29
Distributed Eventually Consistent Computations

Christopher Meiklejohn looks at applying two techniques together, deterministic data flow programming and conflict-free replicated data types, to create highly available and fault-tolerant systems.

Christopher Meiklejohn
on Aug 15, 2015

48:01
Distributed Scheduling with Apache Mesos in the Cloud

Diptanu Choudhury discusses the design of Netflix’s distributed scheduler based on Mesos and Titan, focusing on bin packing algorithms, scaling in and out of clusters, fault tolerance, and redundancy.

Diptanu Choudhury
on Aug 02, 2015

58:24
Thinking in a Highly Concurrent, Mostly-functional Language

Francesco Cesarini illustrates how the Erlang way of thinking about problems leads to scalable and fault-tolerant designs, describing three ways of clustering Erlang nodes within the server-side domain.

Francesco Cesarini
on Apr 08, 2015

48:58
Tumblr - Bits to Gifs

John Bunting talks about the different services Tumblr has built and how their architecture helps them stay fault tolerant as they continue to grow.

John Bunting
on Dec 25, 2014

36:48
Fault Tolerance 101

Joe Armstrong discusses fault-tolerant systems, summarizing the key features of Erlang and showing how they can be used for programming fault-tolerant and scalable systems on multi-core clusters.

Joe Armstrong
on Jun 05, 2014

52:21
Fault Tolerance Made Easy

Uwe Friedrichsen discusses implementing resilient software design patterns (code included), improving those patterns to achieve robustness, and becoming a resilient software developer.

Uwe Friedrichsen
on Jun 03, 2014

45:19
Fault Tolerance 101

Joe Armstrong discusses how fault tolerance relates to scalability and concurrency, and how Erlang helps build fault-tolerant systems on multi-core clusters.

Joe Armstrong
on May 25, 2014

53:41
Programming, Only Better

Bodil Stokke gives a keynote on FP languages for writing bug-free, fault-tolerant code that helps build simple, concurrent, and reusable software.

Bodil Stokke
on Mar 19, 2014

01:04:39
Architecting for High Availability

Attila Narin discusses AWS concepts: Availability Zones, RDS Multi-AZ deployments, SQS and Auto Scaling, Elastic IP, load balancing, DNS, DynamoDB, Amazon S3, etc., and EC2 best practices.

Attila Narin
on May 14, 2013

48:02
Designing Fault Tolerant Distributed Applications

Scott Andreas discusses creating fault-tolerant distributed applications and demos Ordasity, a framework for building self-organizing systems with services.

Scott Andreas
on Mar 29, 2013

48:07