John Allspaw provides a glimpse into how other fields handle incident response, including active steps companies can take to support engineers in those uncertain and ambiguous scenarios.
Josh Evans discusses architectural patterns used by Netflix to enable seamless multi-region traffic management, fast and reliable data propagation, and efficient service infrastructure.
Josh Evans uses the Netflix Operations Engineering team as a case study to explore the challenges faced by centralized engineering teams and approaches to addressing those challenges.
Michael Brunton-Spall shows how DevOps-like patterns can be applied to microservices to give the development teams more responsibility for their choices, and much more.
Dustin Huptas and Andreas Schmidt present some of the operational challenges met when dealing with microservices, and offer solutions from the field of automation and service discovery.
John Wilkes shares lessons learned managing clusters at the scale of Google.
Robert Benefield offers a pragmatic overview for discovering operational indicators that provide valuable insight into running and improving online services.
Pedro Canahuati describes how Facebook's operations team maintains its infrastructure, including challenges faced and lessons learned: prioritizing calls, managing technical debt, and handling incidents.
Ben Christensen describes the Netflix API's evolution to a web service platform serving all devices and users, and the challenges met in operations, deployment, performance, fault tolerance, and innovation.
Joe Sondow presents how Netflix uses Asgard to deploy code updates and manage resources in the Amazon cloud.
Roy Rapoport discusses how Netflix uses metrics to monitor and manage its operating environment, along with some notes about its event management system.
Filippos Santas explains how to apply service-orientation principles, patterns, processes and SOA governance precepts to ITIL's service lifecycle stages, key processes and activities.