DevOps Content on InfoQ
-
Culturing Resiliency with Data: a Taxonomy of Outages
Ranjib Dey gives an overview of how outages at Uber over the past few years were categorized by root cause type.
-
Certainty among the Chaos
Marco Coulter discusses the capabilities of chaos engineering beyond resiliency to support capacity optimization.
-
The More You Know: a Guide to Understanding Your Systems
Tyler Wells shares how Twilio developed a template that enables them to understand their systems better, identify critical metrics to watch, and use Chaos Engineering to verify it all.
-
Convergence of Chaos Engineering and Revolutionized Technology Techniques
Yury Niño Roa explores how emerging paradigms can use Chaos Engineering to manage the pain points on the path toward providing a solution, and shows how Chaos Engineering can benefit from AI.
-
The Past, Present, and Future of Cloud Native API Gateways
Daniel Bryant discusses the evolution of API gateways over the past ten years, current challenges of using Kubernetes, strategies for exposing services and APIs, and the (potential) future of gateways.
-
Let Devs Be Devs: Abstracting away Compliance and Reliability to Accelerate Modern Cloud Deployments
Rahul Arya shares how they built a platform to abstract away compliance, make reliability with Chaos Engineering completely self-serve, and enable developers to ship code faster.
-
Can Chaos Coerce Clarity from Compounding Complexity? Certainly
Matt Simons attempts to catch some Black Swans in a system’s architecture and infrastructure, hidden in increased complexity.
-
InfoQ Live Roundtable: Production Readiness: Building Resilient Systems
The panelists discuss observability, security, the software supply chain, CI/CD, chaos engineering, deployment techniques, canaries, and blue-green deployments, all in pursuit of production resiliency.
-
Lessons from Incident Management and Postmortems at Atlassian
Jim Severino shares what worked (and didn't work) in incident management and postmortems at Atlassian.
-
Identifying Hidden Dependencies
Liz Fong-Jones discusses some of the manual experiments they ran at Honeycomb, the bugs discovered in some automatic replacement tools, and what steps they took for continuously running experiments.
-
InfoQ Live Roundtable: Observability Patterns for Distributed Systems
The panelists explore how a sound observability strategy can help mitigate operational costs and avoid common pitfalls in monitoring distributed systems.
-
Automating Chaos Attacks
Daniel Albuquerque and Nikos Katirtzis show how to run attacks in both manual and automated ways.