Resilience Content on InfoQ
-
Containers Change Everything
Anne Currie talks about the architectural impact of containers, and what modern container schedulers mean for resilience, redundancy and server density.
-
Distributed Consensus: Making the Impossible Possible
Heidi Howard explores how to construct resilient distributed systems on top of unreliable components. Howard discusses which algorithms are best suited to different situations.
-
Resilient Predictive Data Pipelines
Sid Anand discusses how Agari is applying big data best practices to the problem of securing its customers from email-borne threats, presenting a system that leverages big data in the cloud.
-
Resilience Planning & How the Empire Strikes Back
Bhakti Mehta covers best practices for building resilient, stable and predictable services: preventing cascading failures, the timeout and retry patterns, circuit breakers and other techniques.
-
Building Highly-resilient Systems at Pinterest
Yongsheng Wu talks about how to build highly-resilient systems at scale. Wu also presents failure cases that prompted engineers at Pinterest to build such systems, and how they test these systems.
-
Fail Better: Radical Ideas from the Practice of Cloud Computing
Tom Limoncelli discusses creating resiliency at the most economical level, doing risky procedures often, and creating a blameless culture to encourage communication and improve system reliability.
-
Resilience, Service Discovery and Zero Downtime Deployment in Microservice Architectures
York Xyander and Bodo Junglas discuss strategies for service discoverability and transparent failover in a microservices architecture, and how to achieve zero downtime and an auto-scaling architecture.
-
Responding Rapidly When You Have 100GB+ Data Sets in Java
Peter Lawrey discusses data-driven reactive systems, profiling latency distribution in such an environment, finding rare bugs, implementing resilience and monitoring.
-
Opportunities to Improve System Reliability and Resilience
Donald Belcham explains how to improve a system’s reliability by using appropriate code patterns.
-
You Won't Believe How the Biggest Sites Build Scalable and Resilient Systems!
The authors discuss the lessons learned from the biggest sites on the internet about how to build scalable and resilient architectures.
-
Going Reactive: Event-Driven, Scalable, Resilient & Responsive Systems
Jonas Bonér discusses four key traits of Reactive Apps: Event-Driven, Scalable, Resilient and Responsive, how they impact application design, how they interact, related technologies and techniques.
-
Building Resilience: How Outages Shaped Etsy's Systems
Avleen Vig presents some of the most unexpected, confusing, hilarious and face-palming events during Etsy's outages to show what can be learnt from their problems to build more resilient systems.