AWS Announces General Availability of DevOps Agent for Automated Incident Investigation

AWS has announced the general availability of DevOps Agent, a generative AI–powered assistant designed to help developers and operators troubleshoot issues, analyze deployments, and automate operational tasks across AWS environments.

Introduced in preview at re:Invent 2025 and built on Amazon Bedrock AgentCore, DevOps Agent analyzes incidents by learning application relationships and integrating with observability tools, runbooks, code repositories, and CI/CD pipelines. The agent correlates telemetry, code, and deployment data to autonomously triage issues, speed up resolution, and identify patterns in past incidents to recommend improvements that help prevent future outages. Announcing the general availability, Madhu Balaji, senior solution architect at AWS, writes:

A SRE responding to a 2 AM page must manually correlate telemetry from multiple sources, trace dependencies across services, and form hypotheses — a process that routinely takes hours. As systems grow in complexity, the need for an AI-powered operational teammate — an SRE agent — has become increasingly clear.

With general availability, the main additions are the ability to investigate applications running in Azure and on-premises environments, support for custom agent skills that extend the agent's capabilities, and custom charts and reports. Balaji adds:

DevOps Agent is not a passive Q&A tool, it is an autonomous teammate. When an incident triggers via a CloudWatch alarm, PagerDuty alert, Dynatrace Problem, ServiceNow ticket, or any other event source you configure through the webhook, the agent begins investigating immediately without human prompting.

In a separate article, using a serverless URL shortener application as an example, Janardhan Molumuri, Bill Fine, Joe Alioto, and Tipu Qureshi explain how to leverage agentic AI for autonomous incident response with DevOps Agent. They write:

Extensibility through the MCP and built-in integrations with CloudWatch, Datadog, Dynatrace, New Relic, Splunk, Grafana, GitHub, GitLab, and Azure DevOps ensures the agent can pull signals from wherever the team’s operational data lives.

[Image: AWS DevOps Agent. Source: AWS blog]

According to the cloud provider, DevOps teams often start incident investigations using AI coding tools connected to logs and monitoring systems, but these tools lack the broader context and operational controls needed to manage complex production environments at scale. Sebastian Korfmann, co-creator of Agentic Hamburg, writes:

The early numbers are compelling: up to 75% lower MTTR and 94% root cause accuracy in preview. Integrates with Datadog, Grafana, Splunk, PagerDuty, ServiceNow, and more.

Corey Quinn, chief cloud economist at The Duckbill Group, comments:

You're paying for the privilege of having AI do what your 2 AM on-call engineer does, except it won't passive-aggressively Slack the team about it afterward. MTTR drops from hours to minutes; invoices go from minutes to hours.

In a popular Reddit thread, many developers question the lack of an accountability model, with user The_Flexing_Dude asking:

Is that the same one that dropped a production environment last month?

With general availability, the service is no longer free: pricing is based on the cumulative time the agent spends on operational tasks, billed per second. AWS Support customers receive monthly DevOps Agent credits calculated from their previous month’s support spending, with the credit percentage depending on their support level. The service is currently available in six regions, including Northern Virginia, Ireland, and Frankfurt.
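To make the billing model concrete, the sketch below estimates a monthly charge from cumulative agent time billed per second, with credits subtracted. It is an illustration only: the per-second rate, the function name, and the assumption that credits act as a simple offset against the charge are all hypothetical and do not come from AWS's published pricing.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical rate for illustration only; NOT AWS's published price.
HYPOTHETICAL_RATE_PER_SECOND = Decimal("0.01")


def estimate_monthly_charge(task_seconds, credits=Decimal("0"),
                            rate=HYPOTHETICAL_RATE_PER_SECOND):
    """Estimate a monthly charge from per-task agent durations (in seconds).

    Assumes credits are a plain offset against the charge, which is a
    simplification made for this example.
    """
    total_seconds = sum(task_seconds)         # cumulative time the agent spent on tasks
    gross = Decimal(total_seconds) * rate     # per-second billing
    net = max(gross - credits, Decimal("0"))  # credits cannot push the bill below zero
    return net.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Three investigations lasting 10, 30, and 20 minutes, with 10.00 in credits.
charge = estimate_monthly_charge([600, 1800, 1200], credits=Decimal("10"))
print(charge)
```

Because the charge scales with cumulative agent time rather than with the number of incidents, a few long-running investigations can cost as much as many short ones.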

In a separate announcement, AWS made Security Agent on-demand penetration testing generally available. The AI-powered agent continuously analyzes application design, code, and runtime behavior to automatically perform on-demand penetration testing and identify exploitable security vulnerabilities.

About the Author

Renato Losio
