DevOps Content on InfoQ
-
The Ironies of A^2 I^2
J. Paul Reed explains the "ironies of automation" and AI in incident response. He discusses how reliance on AI can erode manual skills and camouflage system failures during high-stakes outages.
-
Powering the Future: Building Your GenAI Infrastructure Stack
Merrin Kurian discusses Intuit’s GenOS, a generative AI operating system powering agents for 100M users. She explains the transition from chat assistants to "done-for-you" autonomous experiences.
-
Evolution of a Backend for a Streaming Application
Daniele Frasca shares how to scale streaming apps for millions of users using serverless patterns. Learn to eliminate single points of failure, improve data consistency, and master multi-region deployments.
-
How Netflix Shapes our Fleet for Efficiency and Reliability
Joseph Lynch and Argha C. discuss how Netflix balances hardware supply and software demand. They explain techniques like risk-adjusted net value, buffer management, and priority-based load shedding.
-
AI-Powered SRE for Autonomous Incident Response
The presenters discuss how AI-enhanced SRE platforms connect signals from logs, metrics, traces, and historical incidents to enable autonomous decisions during incident response.
-
Week-Long Outage: Lifelong Lessons
Molly Struve shares a "murder mystery" outage story from a massive Elasticsearch upgrade. She explains why you need a rollback plan, how to check biases, and why leadership support is a stabilizer.
-
Building a Future-Proof Observability Platform to Empower Engineers
Wayne Bell and Dan Gomez Blanco explain how Skyscanner transitioned from siloed telemetry to a unified OpenTelemetry standard, treating their internal platform as a product to drive adoption.
-
From VR to Flat Screens: Bridging the Input and Immersion Gap
Dany Lepage explains how Lucky VR scaled "Vegas Infinite" from Meta Quest to PS5, PC, and mobile. He shares the technical hurdles of cross-play, dual avatar systems, and the "product fit" trap.
-
Platform Engineering: Lessons from the Rise and Fall of eBay Velocity
Randy Shoup shares how eBay doubled engineering productivity but failed to pivot the business. He explains the technical wins of the Velocity Initiative and the cultural hurdles that remained.
-
Duolingo's Kubernetes Leap
Franka Passing explains Duolingo's migration from AWS ECS to EKS, discussing how they built a foundation with Argo CD and Karpenter to enable blue-green deployments for 128M+ active users.
-
Platform Engineering as a Practice of Sociotechnical Excellence
Lesley Cordero explains how platform engineering serves as a sociotechnical solution for scaling orgs. She shares strategies for joint optimization, communal learning, and distributed leadership.
-
No QA Environment? No Problem: How ClassPass Enables Testing on a Single Environment in ECS
Po Linn Chia explains how ClassPass eliminated environment contention using ECS, Traefik, and OpenTelemetry baggage to enable scalable, ephemeral testing without a dedicated QA environment.