Fault Tolerance Content on InfoQ
-
Thinking in a Highly Concurrent, Mostly-functional Language
Francesco Cesarini illustrates how the Erlang way of thinking about problems leads to scalable and fault-tolerant designs, describing three ways of clustering Erlang nodes within the server-side domain.
-
Tumblr - Bits to Gifs
John Bunting talks about different services Tumblr has built and how their architecture helps them be fault tolerant as they continue to grow.
-
Fault Tolerance 101
Joe Armstrong discusses fault-tolerant systems, summarizing the key features of Erlang and showing how they can be used for programming fault-tolerant and scalable systems on multi-core clusters.
-
Fault Tolerance Made Easy
Uwe Friedrichsen discusses implementing resilient software design patterns (code included), improving those patterns to achieve robustness, and becoming a resilient software developer.
-
Fault Tolerance 101
Joe Armstrong discusses how fault tolerance relates to scalability and concurrency, and how Erlang helps build fault-tolerant systems on multi-core clusters.
-
Programming, Only Better
Bodil Stokke keynotes on FP languages for writing bug-free, fault-tolerant code, and how they help build simple, concurrent and reusable software.
-
Architecting for High Availability
Attila Narin discusses AWS concepts: Availability Zones, RDS Multi-AZ deployments, SQS and Auto Scaling, Elastic IP, load balancing, DNS, DynamoDB, Amazon S3, etc., and EC2 best practices.
-
Designing Fault Tolerant Distributed Applications
Scott Andreas discusses creating fault-tolerant distributed applications and demos Ordasity, a framework for building self-organizing systems with services.
-
Runaway Complexity in Big Data, and a Plan to Stop It
Nathan Marz outlines several sources of complexity introduced in data systems (lack of human fault tolerance, conflation of data and queries, schemas done wrong) and what can be done to avoid them.
-
Erlang's Open Telecom Platform (OTP) Framework
Steve Vinoski introduces Erlang's OTP framework, outlining some of its main features, including several behaviors: implementations of common patterns useful for concurrent, fault-tolerant applications.
-
Storm: Distributed and Fault-tolerant Real-time Computation
Nathan Marz discusses Storm concepts (streams, spouts, bolts, topologies), explaining how to use Storm's Clojure DSL for real-time stream processing, distributed RPC and continuous computation.
-
Anomaly Detection, Fault Tolerance and Anticipation Patterns
John Allspaw discusses fault tolerance, anomaly detection and anticipation patterns that help create highly available and resilient systems.