DevOps Content on InfoQ
-
Scaling Instagram Infrastructure
Lisa Guo gives an overview of Instagram's infrastructure, covering its history, multi-data center support, tuning uWSGI parameters for scaling, performance monitoring and diagnosis, and the Django/Python upgrade.
-
Hybrid Code-Gen: Designing Cloud Service Client Libraries
Jon Skeet discusses using hybrid code generation to create cloud client libraries in a way that does not constrain the future evolution of a service API.
-
Building Reliability in an Unreliable World
Greg Murphy describes how GameSparks has designed their platform to be tolerant of many things: unreliable and slow internet connectivity, cloud resources that can fail without warning, and more.
-
Challenging Perceptions of NHS IT
Edward Hiley and Dan Rathbone talk about how NHS Digital has built a highly secure and resilient system for processing patient data, applying techniques more often used in the cloud to bare metal servers.
-
Building and Trusting a Cloud Bank
Greg Hawkins discusses how Starling Bank, part of the new movement in FinTech challenger banks, is innovating while addressing the need for resilience in a world where failure is everywhere.
-
Testing Programmable Infrastructure with Ruby
Matt Long talks about some approaches to environment infrastructure testing that his team at OpenCredo has created using Ruby.
-
Big Data Infrastructure @ LinkedIn
Shirshanka Das describes LinkedIn’s Big Data Infrastructure and its evolution through the years, including details on the motivation and architecture of Gobblin, Pinot and WhereHows.
-
From Data Science to Production–Deploy, Scale, Enjoy
Sergii Khomenko introduces best practices in development and covers production deployments to the AWS stack and the use of serverless architecture for data applications.
-
Automating Chaos Experiments in Production
Ali Basiri discusses the motivation behind ChAP (Chaos Automation Platform), how it was implemented, and how Netflix service teams are using it to identify systemic weaknesses.
-
Creating a Collaborative Culture between Dev & Ops
Pedro Canahuati discusses some of the ways the Production Engineering (PE) team at Facebook has worked on building a collaborative culture between the software and operations teams.
-
Winston: Helping Netflix Engineers Sleep at Night
Sayli Karmarkar discusses Winston, a monitoring and remediation platform built for Netflix engineers.
-
Incident Management at the Edge
Lisa Phillips discusses the typical struggles a company runs into when building around-the-clock incident operations and the things Fastly has put in place to make dealing with incidents easier.