Operations management Content on InfoQ
-
MLOps: the Most Important Piece in the Enterprise AI Puzzle
Francesca Lazzeri gives an overview of the latest MLOps technologies and principles that data scientists and ML engineers can apply to their machine learning processes.
-
Developing and Deploying ML across Teams with MLOps Automation Tool
Fabio Grätz and Thomas Wollmann discuss the MLOps Automation tool, and how it can be used to perform DevOps tasks on ML across teams.
-
Iterating on Models on Operating ML
Monte Zweben and Roland Meertens discuss the challenges in building, maintaining, and operating machine learning models.
-
Production & Debugging in a Serverless World
Tal Weiss covers the main things to watch out for and the advanced techniques we can put in place to be prepared to debug even the nastiest serverless production issues.
-
Top Five Things You Can Do to Reduce Operational Load
Rachel Obstler discusses the things one can do to make a big difference in reducing operational work from incidents, reducing duplicate efforts, surfacing issues, and improving response times.
-
Managing Systems in an Age of Dynamic Complexity
Laura Nolan looks at the common architectural shapes of dynamic control planes, and some examples of how they fail. Why are dynamic control planes so hard to run, and what can be done about it?
-
Evolution of Edge @Netflix
Vasily Vlasov reviews Netflix’s edge gateway ecosystem: multiple traffic gateways performing different functions, deployed around the world.
-
Observability to Better Serverless Apps
Erica Windisch dives into how serverless development with observability tooling can help bridge the gap between operations and business intelligence to learn better and iterate faster.
-
Operational Considerations for Containers
Chris Swan discusses how to deal with container operational considerations regarding image management, security, audit, logging, orchestration, and how that relates back to developer experience.
-
Incident Management at the Edge
Lisa Phillips discusses the typical struggles a company runs into when building around-the-clock incident operations and the things Fastly has put in place to make dealing with incidents easier.
-
Keep Calm and Carry on: Scaling Your Org
Charity Majors talks about what it means to do quality operations and software engineering in the year 2016 and beyond, as well as the implications for engineering teams and social systems.
-
Autonomous Operations: Microservices, ML and AI
Rob Harrop discusses the increasingly automated field of operations and what the future might hold when machine learning and AI techniques are brought to bear on the problem of systems operations.