DevOps Content on InfoQ
-
From Monitoring to Observability: eBPF Chaos
Michael Friedrich discusses the steps for learning eBPF, traditional metrics monitoring, and the future of Observability data collection, storage and visualization.
-
How to Build a Successful Cloud Capability on a Heavy Regulated Organization
Ana Sirvent discusses their cloud capability journey, highlighting lessons learned and best practices on culture, processes and technology.
-
Demystifying Kubernetes Platforms with Backstage
Matt Clarke discusses how Spotify's deployments infrastructure team integrated Kubernetes with Backstage to streamline developer productivity and how you can do the same.
-
The Commoditization of the Software Stack: How Application-First Cloud Services are Changing the Game
Bilgin Ibryam discusses the intersection of cloud-native technologies such as Dapr with developer-focused cloud services.
-
Is Your Java Application Slow? Check out These Open-Source Profilers
Johannes Bechberger covers the basic concepts of profiling, such as flame graphs, the usage of async-profiler and JMC, and the advantages and disadvantages of the different tools.
-
The Eternal Sunshine of the Toil-Less Prod
Sasha Rosenbaum discusses the evolution from shipping products to running services, and what she learned while trying different approaches.
-
Celebrity Vulnerabilities: Effective Response to Critical Production Threats
Alyssa Miller dives into the lessons learned from three major open-source security events: the Equifax breach via Struts, the Log4j vulnerabilities and the Spring4Shell exploit.
-
Did the Chaos Test Pass?
Christina Yakomin discusses how to run chaos experiments with Vanguard technologies.
-
From Cloud-Hosted to Cloud-Native
Rosemary Wang discusses the patterns and practices that help one move from cloud-hosted to cloud-native architecture and maximize the benefit and use of the cloud.
-
An Open Source Infrastructure for PyTorch
Mark Saroufim discusses tools and techniques to deploy PyTorch in production.
-
How Did It Make Sense at the Time? Understanding Incidents as They Occurred, Not as They are Remembered
Jacob Scott explores the basics of failure in complex systems, the theory and practice of how it made sense at the time, and actions to take.
-
Effective and Efficient Observability with OpenTelemetry
Daniel Gomez Blanco shares his experience leading a large-scale observability initiative at Skyscanner, based on the adoption of OpenTelemetry across hundreds of services.