Joe Armstrong describes the foundations of fault-tolerant computation and the basic properties a system needs in order to function adequately despite hardware and software errors. He summarizes the key features of Erlang and shows how they can be used to program fault-tolerant and scalable systems on multi-core clusters.
Uwe Friedrichsen discusses several easy-to-implement resilient software design patterns, when to use them, and how to implement them, with code included. He also covers options for extending and improving those patterns to make an application more robust step by step, reaching the next level after agile and clean code: becoming a resilient software developer.
Joe Armstrong discusses how fault tolerance relates to scalability and concurrency, and how Erlang helps build fault-tolerant systems on multi-core clusters.
Bodil Stokke keynotes on FP languages for writing bug-free, fault-tolerant code, and on how they help build simple, concurrent, and reusable software.
Attila Narin discusses AWS concepts: Availability Zones, RDS Multi-AZ deployments, SQS and Auto Scaling, Elastic IP, load balancing, DNS, DynamoDB, Amazon S3, etc., and EC2 best practices.
Scott Andreas discusses creating fault-tolerant distributed applications and demos Ordasity, a framework for building self-organizing systems with services.
Nathan Marz outlines several sources of complexity introduced in data systems (lack of human fault-tolerance, conflation of data and queries, schemas done wrong) and what can be done to avoid them.
Steve Vinoski introduces Erlang’s OTP Framework, outlining some of its main features, including several behaviors – implementations of common patterns useful for concurrent fault-tolerant applications.
Nathan Marz discusses Storm concepts (streams, spouts, bolts, topologies), explaining how to use Storm’s Clojure DSL for real-time stream processing, distributed RPC, and continuous computations.
John Allspaw discusses fault tolerance, anomaly detection, and anticipation patterns that help create highly available and resilient systems.
Daniel Jacobson covers the history of Netflix’s APIs, adaptation for the cloud, development and testing, resiliency, and the future of their APIs.
Michael Brunton-Spall talks about various types of system failure that can happen, sharing the lessons learned at the Guardian and measures taken to prevent and mitigate failure.