Today, we’re announcing the public preview of AWS DevOps Agent, a frontier agent that helps you respond to incidents, identify root causes, and prevent future issues through systematic analysis of past incidents and operational patterns.
Frontier agents represent a new class of AI agents that are autonomous, massively scalable, and work for hours or days without constant intervention.
When production incidents occur, on-call engineers face significant pressure to quickly identify root causes while managing stakeholder communications. They must analyze data across multiple monitoring tools, review recent deployments, and coordinate response teams. After service restoration, teams often lack the bandwidth to transform incident learnings into systematic improvements.
AWS DevOps Agent is your always-on, autonomous on-call engineer. When issues arise, it automatically correlates data across your operational toolchain, from metrics and logs to recent code deployments in GitHub or GitLab. It identifies probable root causes and recommends targeted mitigations, helping reduce mean time to resolution. The agent also manages incident coordination, using Slack channels for stakeholder updates and maintaining detailed investigation timelines.
To get started, you connect AWS DevOps Agent to your existing tools through the AWS Management Console. The agent works with popular services such as Amazon CloudWatch, Datadog, Dynatrace, New Relic, and Splunk for observability data, while integrating with GitHub Actions and GitLab CI/CD to track deployments and their impact on your cloud resources. Through the bring your own (BYO) Model Context Protocol (MCP) server capability, you can also integrate additional tools into your investigations, such as your organization’s custom tools, specialized platforms, or open source observability solutions like Grafana and Prometheus.
The agent acts as a virtual team member and can be configured to automatically respond to incidents from your ticketing systems. It includes built-in support for ServiceNow and, through configurable webhooks, can respond to events from other incident management tools like PagerDuty. As investigations progress, the agent updates tickets and relevant Slack channels with its findings. All of this is powered by an intelligent application topology the agent builds: a comprehensive map of your system components and their interactions, including deployment history that helps identify potential deployment-related causes during investigations.
Let me show you how it works
To show you how it works, I deployed a straightforward AWS Lambda function that intentionally generates errors when invoked. I deployed it in an AWS CloudFormation stack.
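As an illustration, a function of this kind could be as small as the following Python sketch. The handler name lambda_handler and the exception class IntentionalTestError are assumptions made for this example, not the code used in the demo.

```python
# Hypothetical demo handler: every invocation fails on purpose.
class IntentionalTestError(Exception):
    """Raised deliberately so that each invocation is recorded as an error."""


def lambda_handler(event, context):
    # An unhandled exception makes the Lambda service mark the invocation
    # as failed, which increments the function's Errors metric in CloudWatch.
    raise IntentionalTestError("Intentional test exception for the demo")
```

Because the exception is never caught, each invocation is counted as a function error, which is what the alarm later in this walkthrough reacts to.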
Step 1: Create an Agent Space
An Agent Space defines the scope of what AWS DevOps Agent can access as it performs tasks.
You can organize Agent Spaces based on your operational model. Some teams align an Agent Space with a single application, others create one per on-call team managing multiple services, and some organizations use a centralized approach. For this demonstration, I’ll show you how to create an Agent Space for a single application. This setup helps isolate investigations and resources for that specific application, making it easier to track and analyze incidents within its context.
In the AWS DevOps Agent section of the AWS Management Console, I select Create Agent Space, enter a name for this space, and create the AWS Identity and Access Management (IAM) roles it uses to introspect AWS resources in my or others’ AWS accounts.
For this demo, I choose to enable the AWS DevOps Agent web app; more about this later. You can also enable it at a later stage.
When ready, I choose Create.
After it has been created, I choose the Topology tab.
This view shows the key resources, entities, and relationships AWS DevOps Agent has selected as a foundation for performing its tasks efficiently. It doesn’t represent everything AWS DevOps Agent can access or see, only what the agent considers most relevant right now. By default, the Topology includes the AWS resources that are contained in my account. As your agent completes more tasks, it will discover and add new resources to this list.
Step 2: Configure the AWS DevOps Agent web app for the operators
The AWS DevOps Agent web app provides a web interface for on-call engineers to manually trigger investigations, view investigation details including relevant topology elements, steer investigations, and ask questions about an investigation.
I can access the web app directly from my Agent Space in the AWS console by choosing the Operator access link. Alternatively, I can use AWS IAM Identity Center to configure user access for my team. IAM Identity Center lets me manage users and groups directly or connect to an identity provider (IdP), providing a centralized way to control who can access the AWS DevOps Agent web app.
At this stage, I have an Agent Space all set up to focus investigations and resources for this particular application, and I’ve enabled the DevOps team to initiate investigations using the web app.
Now that the one-time setup for this application is done, I start invoking the faulty Lambda function. It generates errors at each invocation. The CloudWatch alarm associated with the Lambda error count switches to the ALARM state. In real life, you might receive an alert from external services, such as ServiceNow. You can configure AWS DevOps Agent to automatically start investigations when receiving such alerts.
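For readers who want to reproduce a similar alarm, here is a minimal sketch of the parameters such an alarm might use. It relies on the standard AWS/Lambda Errors metric and its FunctionName dimension. The function name faulty-demo-function, the alarm naming scheme, and the threshold and period values are assumptions for illustration, not the settings used in this demo.

```python
def build_lambda_errors_alarm(function_name, threshold=1, period_seconds=60):
    """Return keyword arguments describing an alarm on a function's Errors metric."""
    return {
        "AlarmName": f"{function_name}-errors",  # hypothetical naming scheme
        "Namespace": "AWS/Lambda",
        "MetricName": "Errors",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Statistic": "Sum",                # total errors within each period
        "Period": period_seconds,
        "EvaluationPeriods": 1,
        "Threshold": threshold,            # assumed value for this sketch
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",  # no invocations means no alarm
    }


params = build_lambda_errors_alarm("faulty-demo-function")
# With boto3 installed and credentials configured, these parameters could be
# passed to CloudWatch: boto3.client("cloudwatch").put_metric_alarm(**params)
```

Building the parameters separately from the API call keeps the sketch runnable without AWS credentials. With a threshold of one error per minute, a single failed invocation is enough to move the alarm to the ALARM state.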
For this demo, I manually start the investigation by selecting Start Investigation.
You can also choose from several preconfigured starting points to quickly begin your investigation: Latest alarm to investigate your most recently triggered alarm and analyze the underlying metrics and logs to determine the root cause, High CPU usage to investigate high CPU utilization metrics across your compute resources and identify which processes or services are consuming excessive resources, or Error rate spike to investigate the recent increase in application error rates by analyzing metrics and application logs and identifying the source of failures.
I enter some information, such as Investigation details, Investigation starting point, the Date and time of the incident, and the AWS Account ID for the incident.
In the AWS DevOps Agent web app, you can watch the investigation unfold in real time. The agent identifies the application stack. It correlates metrics from CloudWatch, examines logs from CloudWatch Logs or external sources, such as Splunk, reviews recent code changes from GitHub, and analyzes traces from AWS X-Ray.
It identifies the error patterns and provides a detailed investigation summary. In the context of this demo, the investigation reveals that these are intentional test exceptions, shows the timeline of function invocations leading to the alarm, and even suggests monitoring improvements for error handling.
The agent uses a dedicated incident channel in Slack, notifies on-call teams if needed, and provides real-time status updates to stakeholders. Through the investigation chat interface, you can interact directly with the agent by asking clarifying questions such as “which logs did you analyze?” or steering the investigation by providing additional context, such as “focus on these specific log groups and rerun your analysis.” If you need expert assistance, you can create an AWS Support case with a single click, automatically populating it with the agent’s findings, and engage with AWS Support experts directly through the investigation chat window.
For this demo, AWS DevOps Agent correctly identified manual actions in the Lambda console to invoke a function that intentionally triggers errors 😇.
Beyond incident response, AWS DevOps Agent analyzes my recent incidents to identify high-impact improvements that prevent future issues.
During active incidents, the agent offers immediate mitigation plans through its incident mitigations tab to help restore service quickly. Mitigation plans consist of specs that provide detailed implementation guidance for developers and agentic development tools like Kiro.
For longer-term resilience, it identifies potential enhancements by analyzing gaps in observability, infrastructure configurations, and deployment pipelines. My straightforward demo that triggered intentional errors was not enough to generate relevant recommendations, though.
For example, it might detect that a critical service lacks multi-AZ deployment and comprehensive monitoring. The agent then creates detailed recommendations with implementation guidance, considering factors like operational impact and implementation complexity. In an upcoming quick follow-up release, the agent will expand its analysis to include code bugs and testing coverage improvements.
Availability
You can try AWS DevOps Agent today in the US East (N. Virginia) Region. Although the agent itself runs in US East (N. Virginia) (us-east-1), it can monitor applications deployed in any Region, across multiple AWS accounts.
During the preview period, you can use AWS DevOps Agent at no charge, but there will be a limit on the number of agent task hours per month.
As someone who has spent countless nights debugging production issues, I’m particularly excited about how AWS DevOps Agent combines deep operational insights with practical, actionable recommendations. The service helps teams move from reactive firefighting to proactive system improvement.
To learn more and sign up for the preview, visit AWS DevOps Agent. I look forward to hearing how AWS DevOps Agent helps improve your operational efficiency.









