Sadek Drobi talks about the prismic.io API, how to understand the properties and mechanics of a system, and how to partition its different dimensions to avoid a domino-style failure cascade.
Pete Smith shares from his experience, discussing what it means to fail and how to make the most of it.
Tammer Saleh talks about the mistakes made building microservices, when microservices are appropriate, where to draw the lines between services, performance issues, testing, debugging, failure, etc.
Fangjin Yang covers common problems and failures seen with distributed systems, and discusses design patterns that can be used to maintain data integrity and availability when everything goes wrong.
Nate Fink shares how Yammer has changed everything from how they structure teams to the role of managers to how they measure progress, so they can not only survive but thrive by learning.
Jon Moore goes over some strategies for surviving in a jungle of partial failures. Each survival tip is explained through a concrete example, or "adventure story", from Comcast’s TV experience.
Matt Buckland discusses some of the cultures he has encountered in his work experience, the success stories and the failures, outlining what makes a great organizational culture.
Matt Heath discusses how circuit breakers and other similar patterns can be used to increase reliability in distributed systems such as Go-based microservice platforms.
Tom Limoncelli discusses creating resiliency at the most economic level, doing risky procedures often, and creating a blameless culture to encourage communication and improve system reliability.
Kolton Andrus presents how Netflix, in order to harden their systems, designed “Failure as a Service” to allow anyone to test and validate how their systems handle failure.
York Xyander and Bodo Junglas discuss strategies for service discoverability and transparent failover in a microservices architecture, and how to achieve zero downtime and an auto-scaling architecture.
Shobana Radhakrishnan shares details about best practices adopted in implementing API integration with third-party services, how to manage change and deal with failures.