DevOps Content on InfoQ
-
Solving Mysteries Faster with Observability
Elizabeth Carretto discusses observability at Netflix and where and how their internal tool, Edgar, comes into play.
-
A Sticky Situation: How Netflix Gains Confidence in Changes
Haley Tucker discusses sticky canaries, what they are and how they can help, and how to build confidence in changes.
-
Paving the Road to Production
Graham Jenson shares his experience of creating "paved roads" and deploying pipelines at Coinbase over the past five years, and explains the advantages of doing so.
-
Greenwater, Washington: an Availability Story
Marc Brooker discusses defining and designing for availability that takes people into account, including examples of massive-scale cloud systems designed using these principles.
-
Failing Fast: the Impact of Bias When Speeding up Application Security
Laura Bell explores how bias impacts the security of a development lifecycle and examines 3 common biases that lead to big issues in this space.
-
Cloud Native Is about Culture, Not Containers
Holly Cummins shares stories of customers struggling to go cloud native and all the ways things can go wrong.
-
Leading Technical Projects - and How to Get Them Done
Sarah Wells shares stories on how the Operations and Reliability team at the FT built tools that are used by lots of their development teams: the challenges they faced, the things they tried and more.
-
Production & Debugging in a Serverless World
Tal Weiss covers the main things to watch out for and the advanced techniques we can put in place to be prepared to debug even the nastiest Serverless production issues.
-
Scaling Culture of Resiliency in the Enterprise
Nate Vogel shares how he grew the data engineering team with an emphasis on building a culture of reliability, discussing processes and tools used.
-
IBM’s Principles of Chaos Engineering
Haytham Elkhoja discusses the process of getting engineers from across the company to agree on a list of Chaos Engineering principles, adapting existing principles to customer requirements and internal services.
-
Armor CLAD Functions
Guy Podjarny talks about how to properly secure our cloud functions. He uses a model called CLAD to remember what's left to protect, and discusses concrete practices to scale our defences.
-
Top Five Things You Can Do to Reduce Operational Load
Rachel Obstler discusses the things one can do to make a big difference in reducing operational work from incidents, reducing duplicate efforts, surfacing issues, and improving response times.