Overview
Alert → correlate → hypothesize → mitigate → communicate: a full first-responder loop grounded in your real telemetry.
Acts only within its authority: low-risk, reversible steps run automatically; rollbacks, scaling, and anything risky require human approval.
Evidence-based: every hypothesis cites the metric, log, or deploy that supports it — no guessing at root cause.
Fails to a human, fast: real SEV1s and ambiguous, high-blast-radius incidents are escalated and on-call is paged, with a clean summary and a holding status update.
AgentAz™ specification
A lightweight, design-time governance spec for security review. It documents what this agent is authorized to do — and why — and pairs with whatever policy engine you already run. It does not enforce anything at runtime.
Machine-readable contract (agentaz.json), validated against the open AgentAz™ JSON Schema — bundled for offline use and published at a permanent URL:
{
"$schema": "./agentaz.schema.json",
"agent_id": "incident-response-agent",
"version": "2.0.0",
"trust_level": "A4",
"dna_pattern": "Execution",
"worst_case_action": "Runs a reversible, low-risk auto-remediation that was unnecessary; rolled back. Irreversible actions require human approval.",
"authority_boundary": "Auto-runs allowlisted reversible steps with rollback; risky/irreversible actions require human approval.",
"last_reviewed": "2026-06-24",
"tags": [
"devops",
"sre",
"security",
"sandboxed",
"rollback",
"human-approval",
"agentaz",
"agent-governance",
"trust-level",
"production-readiness"
],
"tool_boundary": {
"auto_executable_tools": [
"restart_service",
"clear_cache",
"rotate_log",
"health_check"
],
"approval_required_tools": [
"rollback_deploy",
"scale_service",
"change_config",
"modify_security_group"
],
"execution_tools_absent": false,
"rollback_required": true
},
"output_boundary": {
"format": "structured_json",
"never_without_approval": [
"rollback_deploy",
"scale_service",
"change_config",
"modify_security_group"
]
},
"cost_boundary": {
"max_usd_per_trace_loop": 0.35,
"alert_threshold_usd": 0.25
},
"loop_boundary": {
"max_reasoning_turns": 12
},
"human_handoff": {
"triggers": [
"sev1",
"irreversible_action",
"low_confidence"
],
"destination": "oncall_engineer"
},
"audit": {
"append_only": true,
"logs": [
"actions",
"rollbacks",
"approvals",
"escalations"
]
}
}
New to this? Read the AgentAz specification guide: Trust Levels, DNA patterns, and how it complements your runtime.
This is a flagship reference blueprint for AgentAz v1.0.0. AgentAz™ is open source under Apache-2.0 (spec text under CC‑BY‑4.0) — schema and source on GitHub.
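Because the contract is documentation rather than runtime enforcement, a small pre-merge check can catch internal inconsistencies before review. The sketch below is illustrative only: `check_contract` and the three checks it performs are assumptions, not part of the AgentAz schema or tooling, and the contract is assumed to be already loaded into a dict.

```python
def check_contract(contract: dict) -> list[str]:
    """Return a list of human-readable problems; empty means the checks passed."""
    problems = []
    tools = contract["tool_boundary"]
    auto = set(tools["auto_executable_tools"])
    gated = set(tools["approval_required_tools"])
    never = set(contract["output_boundary"]["never_without_approval"])

    # A tool must not sit in both the auto-executable and approval-gated tiers.
    overlap = auto & gated
    if overlap:
        problems.append(f"tools in both tiers: {sorted(overlap)}")

    # Every approval-gated tool should also appear in never_without_approval.
    missing = gated - never
    if missing:
        problems.append(f"gated tools missing from never_without_approval: {sorted(missing)}")

    # The cost alert should fire before the hard cap is reached.
    cost = contract["cost_boundary"]
    if cost["alert_threshold_usd"] >= cost["max_usd_per_trace_loop"]:
        problems.append("alert_threshold_usd must be below max_usd_per_trace_loop")

    return problems


# Trimmed copy of the contract shown above, for illustration only.
example = {
    "tool_boundary": {
        "auto_executable_tools": ["restart_service", "clear_cache", "rotate_log", "health_check"],
        "approval_required_tools": ["rollback_deploy", "scale_service", "change_config", "modify_security_group"],
    },
    "output_boundary": {
        "never_without_approval": ["rollback_deploy", "scale_service", "change_config", "modify_security_group"],
    },
    "cost_boundary": {"max_usd_per_trace_loop": 0.35, "alert_threshold_usd": 0.25},
}

print(check_contract(example))  # prints []
```

A check like this complements schema validation: the schema confirms shape, while these assertions confirm that the tiers agree with each other.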
Governance matrix
A scannable summary of this blueprint's governance coverage, derived from its AgentAz™ specification. It documents the boundaries that already ship — not new functionality.
| Dimension | Coverage |
|---|---|
| Agent goal | Bounded by the authority spec above |
| Trust Level | A4 — Limited Autonomy |
| Tool access | Scoped tools; high-risk actions gated behind approval |
| Context handling | Grounded in provided inputs; cites or flags rather than guessing |
| Memory strategy | Task-scoped; no persistent cross-session memory |
| Human approval | Required on sev1, irreversible action, low confidence → oncall engineer |
| Audit trail | Append-only log (actions, rollbacks, approvals, escalations) |
| Cost & loop bounds | ≤ $0.35 per loop · ≤ 12 reasoning turns |
| Recovery / escalation | Escalates to oncall engineer |
Agent component mapping
A framework-neutral view of how this blueprint maps to standard agent-architecture components (the vocabulary common to ADK-style frameworks). It describes structure for clarity — not an official integration or certified compatibility.
| Component | Role in this blueprint |
|---|---|
| Agent | Primary reasoner, Limited Autonomy authority (A4) |
| Tools | restart service, clear cache, rotate log, health check; approval-gated: rollback deploy, scale service, change config, modify security group |
| Memory | Task-scoped working context; no persistent cross-session memory |
| Guardrails | Worst-case classified (A4); high-risk actions gated; ≤ $0.35/loop · ≤ 12 turns |
| Evaluator | Confidence and authority-boundary checks; low-confidence or out-of-bounds results are flagged, not actioned |
| Handoff | Escalates to oncall engineer on sev1, irreversible action, low confidence |
Failure modes
Specific ways this blueprint can fail, and how it is designed to detect, contain, and recover from each — the boundaries that make it safe to run, stated plainly.
Misdiagnoses the incident and targets the wrong service with a remediation step.
- Detection
- Pre-action validation checks the action target against the alert's affected service; a mismatch raises an anomaly before anything runs.
- Mitigation
- Remediation actions are reversible and sandboxed; destructive steps are gated behind human approval and must match the diagnosed scope.
- Recovery
- Automatic rollback of the reversible step; the incident is escalated to on-call with the full diagnosis trail.
Acts on a stale or duplicate alert for an incident that is already resolved.
- Detection
- Incident status and a dedup key are checked before any action.
- Mitigation
- An idempotency key per incident makes repeated triggers a no-op.
- Recovery
- The duplicate is closed and the dedup event is logged.
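The dedup and idempotency checks for stale or duplicate alerts might look roughly like the sketch below. It assumes an in-memory set for brevity (a real deployment would need a shared, durable store), and the class name, method names, and key fields are hypothetical.

```python
import hashlib


class IncidentDeduper:
    """In-memory idempotency guard; illustrative only."""

    def __init__(self) -> None:
        self._handled: set[str] = set()

    @staticmethod
    def idempotency_key(service: str, signal: str, incident_id: str) -> str:
        # Hash the identifying fields so the key is stable and fixed-length.
        raw = f"{service}|{signal}|{incident_id}"
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()

    def should_act(self, service: str, signal: str, incident_id: str, resolved: bool) -> bool:
        # Never act on an incident that is already resolved.
        if resolved:
            return False
        key = self.idempotency_key(service, signal, incident_id)
        # A repeated trigger for the same incident is a no-op.
        if key in self._handled:
            return False
        self._handled.add(key)
        return True


deduper = IncidentDeduper()
first = deduper.should_act("checkout-service", "5xx_rate", "INC-1", resolved=False)
repeat = deduper.should_act("checkout-service", "5xx_rate", "INC-1", resolved=False)
print(first, repeat)  # prints True False
```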
Remediation loops — repeated restarts that never converge.
- Detection
- A loop counter and max-reasoning-turn cap detect repeated identical actions.
- Mitigation
- Bounded retries with backoff; escalate after N attempts.
- Recovery
- Automation halts and pages a human with the full attempt history.
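Bounded retries with backoff, followed by escalation with the attempt history, can be sketched as follows. The default of three attempts, the exponential schedule, and the function name are assumptions for illustration; the blueprint only specifies "N attempts".

```python
import time
from typing import Callable


def run_with_bounded_retries(
    action: Callable[[], bool],
    max_attempts: int = 3,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Try an action a bounded number of times, then hand off to a human."""
    history = []
    for attempt in range(1, max_attempts + 1):
        ok = action()
        history.append({"attempt": attempt, "ok": ok})
        if ok:
            return {"resolved": True, "escalate": False, "history": history}
        # Back off exponentially between attempts, but not after the last one.
        if attempt < max_attempts:
            sleep(base_delay_s * 2 ** (attempt - 1))
    # The cap was hit without convergence: halt and page with the full history.
    return {"resolved": False, "escalate": True, "history": history}


delays = []
result = run_with_bounded_retries(lambda: False, sleep=delays.append)
print(result["escalate"], delays)  # prints True [1.0, 2.0]
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting.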
A cascading action makes the incident worse.
- Detection
- A health check runs after each step; if health degrades, the chain aborts.
- Mitigation
- One reversible action at a time with a health gate between steps.
- Recovery
- The last action is rolled back, automation is frozen, and control passes to a human.
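The health gate between steps can be sketched as a loop that applies one reversible step, checks health, and undoes the step if health degrades. `Step`, `run_chain`, and the result fields are hypothetical names; the health probe is assumed to be supplied by the caller.

```python
from typing import Callable, NamedTuple


class Step(NamedTuple):
    name: str
    apply: Callable[[], None]
    undo: Callable[[], None]


def run_chain(steps: list[Step], healthy: Callable[[], bool]) -> dict:
    """Apply steps one at a time; roll back the last step and freeze on degradation."""
    completed = []
    for step in steps:
        step.apply()
        if not healthy():
            # Health degraded after this step: undo it and stop the chain.
            step.undo()
            return {"frozen": True, "rolled_back": step.name, "completed": completed}
        completed.append(step.name)
    return {"frozen": False, "rolled_back": None, "completed": completed}


log = []
probe = iter([True, False])  # healthy after step 1, degraded after step 2
chain = [
    Step("clear_cache", lambda: log.append("apply:clear_cache"), lambda: log.append("undo:clear_cache")),
    Step("restart_service", lambda: log.append("apply:restart_service"), lambda: log.append("undo:restart_service")),
]
outcome = run_chain(chain, lambda: next(probe))
print(outcome["rolled_back"], log[-1])  # prints restart_service undo:restart_service
```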
Evaluation
Action correctness matters most here — whether the remediation it ran or proposed was the right one for the diagnosed incident — because a wrong action has real blast radius.
| Metric | What it measures |
|---|---|
| Diagnosis accuracy | Share of incidents where the identified root cause and affected service match the ground truth. |
| Action correctness | Of actions taken or proposed, the share that were appropriate and scoped to the actual incident. |
| Rollback success | When a reversible action is undone, how reliably the system returns to its prior state. |
| Escalation rate | How often it hands off to on-call, split into correct escalations vs missed or unnecessary ones. |
| Latency to first action | Time from alert to the first correct remediation step. |
Recommended approach. Replay a labeled set of historical incidents in a sandbox and compare proposed actions to what on-call actually did; track rollback success and escalation quality separately. Never grade on live production traffic.
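A scoring pass over replayed incidents might look like the sketch below. The `ReplayResult` fields are hypothetical labels a reviewer would fill in per incident, and the sketch simplifies by assuming one graded action per incident, so action correctness is computed per incident rather than per action.

```python
from dataclasses import dataclass


@dataclass
class ReplayResult:
    diagnosis_correct: bool   # root cause and service matched ground truth
    action_correct: bool      # the action taken or proposed was appropriate
    escalated: bool           # the agent handed off to on-call
    should_escalate: bool     # ground truth: a handoff was warranted


def score(results: list[ReplayResult]) -> dict:
    n = len(results)
    if n == 0:
        raise ValueError("no replayed incidents to score")
    return {
        "diagnosis_accuracy": sum(r.diagnosis_correct for r in results) / n,
        "action_correctness": sum(r.action_correct for r in results) / n,
        "escalation_rate": sum(r.escalated for r in results) / n,
        # Escalation quality is tracked separately from the raw rate.
        "correct_escalations": sum(r.escalated and r.should_escalate for r in results),
        "missed_escalations": sum(r.should_escalate and not r.escalated for r in results),
        "unnecessary_escalations": sum(r.escalated and not r.should_escalate for r in results),
    }


replayed = [
    ReplayResult(True, True, False, False),
    ReplayResult(True, False, True, True),
    ReplayResult(False, False, False, True),
    ReplayResult(True, True, True, False),
]
print(score(replayed)["diagnosis_accuracy"])  # prints 0.75
```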
When to use
Use it when
- You run on-call and want faster triage on the flood of alerts, especially the repetitive, well-understood ones.
- You have metrics, logs, and deploy history the agent can correlate to form grounded hypotheses.
- You have runbooks with clearly safe, reversible steps that can be automated under guardrails.
- You want consistent, timely status updates drafted during an incident.
- You want the agent to handle first response and escalate the genuinely serious incidents to humans with context.
Avoid it when
- You have no observability data for the agent to ground hypotheses in — it would be guessing.
- You expect it to resolve novel SEV1s autonomously; those need experienced humans and the agent should escalate.
- Your mitigations are all high-risk/irreversible with no safe automation surface.
- You are unwilling to put approval gates in front of production-changing actions.
System prompt
You are an Autonomous Incident Response Agent acting as a first responder for an on-call SRE team. Your job is to triage one alert/incident: understand it, mitigate what is safe, communicate clearly, and escalate fast when it is serious. You are judged on reducing time-to-mitigate AND on never taking an unsafe action and never hiding a real incident.
== CORE PRINCIPLES ==
1. Evidence first. Form a hypothesis only from telemetry you have actually queried — metrics, logs, traces, recent deploys/changes. Cite the specific signal. Never assert a cause you cannot show.
2. Safety over speed. A fast wrong action is worse than a clean escalation. When in doubt, stabilize, communicate, and hand to a human.
3. Smallest safe action. Prefer the least invasive, most reversible mitigation that addresses the evidence.
== HARD RULES (NON-NEGOTIABLE) ==
- ACTION TIERS: You may AUTONOMOUSLY take only low-risk, reversible, explicitly allow-listed actions (e.g. restart a stateless pod, clear a cache, scale up within a cap, silence a known-false alert). Any rollback, deploy, scale-down, data operation, traffic shift, or config change to production REQUIRES human approval — propose it, do not execute it.
- NEVER hide severity. Do not downgrade or silence an alert that could be a real incident to make the board look clean. Suppress only alerts you can show are non-actionable, and say why.
- BLAST RADIUS: Estimate the blast radius before any action. If an action could affect a broad scope or a critical/customer-facing service, it is not autonomous — escalate or seek approval.
- DON'T BREAK MORE: Do not take actions that could worsen the incident (e.g. mass restarts during a thundering-herd). If unsure of an action's effect, don't take it.
- COMMUNICATE: Keep humans informed with concise, honest status updates. Never promise a resolution time or root cause you cannot support.
== SEVERITY & DECISION ==
- Assess severity (SEV1 critical/customer-facing outage or data risk; SEV2 major degradation; SEV3 minor/limited; SEV4 noise).
- AUTO_MITIGATE: SEV3/known-pattern with an allow-listed, reversible fix and confidence >= 0.8. Execute, verify, communicate.
- PROPOSE: a non-allow-listed but evidence-backed mitigation (e.g. rollback the suspect deploy). Stage it for one-click human approval with the supporting evidence.
- ESCALATE + PAGE: SEV1/SEV2, broad blast radius, data-loss/security signals, conflicting or missing evidence, or confidence < 0.6. Page on-call, post a holding update, and hand over a structured summary.
== COST CONTROL ==
Query the smallest set of signals that tests your hypothesis; do not pull every dashboard. Stop investigating once you can decide. Cap tool calls; if exceeded, escalate with current evidence. Keep updates short.
== OUTPUT FORMAT (return ONE JSON object) ==
{
"severity": "SEV1|SEV2|SEV3|SEV4",
"confidence": <0.0-1.0>,
"hypothesis": "<likely cause, each claim tied to a cited signal>",
"evidence": ["<metric/log/deploy reference>"],
"blast_radius": "<scope and affected services/users>",
"decision": "AUTO_MITIGATE|PROPOSE|ESCALATE",
"actions": [ { "tool": "<tool>", "args": { ... }, "reversible": <bool>, "requires_approval": <bool> } ],
"status_update": "<concise, honest message for the channel>",
"escalation": { "needed": <bool>, "page": <bool>, "reason": "<why>", "handoff": "<summary + suggested next steps for the human>" }
}
If decision is ESCALATE, do not execute production-changing actions; post the holding update and hand off.
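The JSON object the system prompt asks for can be checked structurally before anything acts on it. The sketch below is one possible validator, not the kit's own: `validate_output` and its specific consistency rules (for example, that an AUTO_MITIGATE decision carries no approval-gated action) are assumptions derived from the output format and hard rules in the prompt.

```python
SEVERITIES = {"SEV1", "SEV2", "SEV3", "SEV4"}
DECISIONS = {"AUTO_MITIGATE", "PROPOSE", "ESCALATE"}
REQUIRED = {
    "severity", "confidence", "hypothesis", "evidence", "blast_radius",
    "decision", "actions", "status_update", "escalation",
}


def validate_output(out: dict) -> list[str]:
    """Return a list of problems; an empty list means the object passed."""
    missing = REQUIRED - out.keys()
    if missing:
        return [f"missing keys: {sorted(missing)}"]

    problems = []
    if out["severity"] not in SEVERITIES:
        problems.append("severity must be one of SEV1..SEV4")
    if out["decision"] not in DECISIONS:
        problems.append("unknown decision")

    conf = out["confidence"]
    is_number = isinstance(conf, (int, float)) and not isinstance(conf, bool)
    if not is_number or not 0.0 <= conf <= 1.0:
        problems.append("confidence must be a number in [0.0, 1.0]")

    # Evidence first: a hypothesis with no cited signal is rejected.
    if not out["evidence"]:
        problems.append("hypothesis has no cited evidence")

    # An auto-mitigation must not carry approval-gated actions.
    if out["decision"] == "AUTO_MITIGATE" and any(
        a.get("requires_approval") for a in out["actions"]
    ):
        problems.append("AUTO_MITIGATE contains an action that requires approval")

    # An escalation must be flagged in the escalation block.
    if out["decision"] == "ESCALATE" and not out["escalation"].get("needed"):
        problems.append("ESCALATE without escalation.needed")

    return problems
```

Rejecting a malformed object outright, rather than repairing it, keeps the failure mode simple: anything that does not validate is treated as an escalation.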
Setup guide
Install and connect observability
Install the agent and connect it (read-only) to your metrics, logs, and deploy systems.
pipx install incident-agent
incident-agent connect --metrics prometheus --logs loki --deploys github
incident-agent doctor   # verifies read access + paging webhook
Set action authority and caps
Define what the agent may do autonomously. Everything else is propose-only. These limits are enforced outside the model.
cp .env.example .env
ANTHROPIC_API_KEY=sk-ant-...
PAGER_WEBHOOK=...
MAX_TOOL_CALLS=8
AUTO_SCALE_CAP=2x
MODE=copilot   # copilot (propose) | responder (auto low-risk)
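Enforcing a cap outside the model means the counting happens in the harness, not in the prompt. A minimal sketch of a tool-call budget, assuming the harness reads `MAX_TOOL_CALLS` from the environment; `ToolBudget` and `ToolBudgetExceeded` are hypothetical names, not part of the kit.

```python
import os


class ToolBudgetExceeded(RuntimeError):
    """Raised when the agent asks for more tool calls than the cap allows."""


class ToolBudget:
    def __init__(self, max_calls: int) -> None:
        self.max_calls = max_calls
        self.used = 0

    def charge(self, tool_name: str) -> None:
        # Refuse the call before it runs, so the cap is a hard limit.
        if self.used >= self.max_calls:
            raise ToolBudgetExceeded(
                f"cap of {self.max_calls} tool calls reached before {tool_name}"
            )
        self.used += 1


# The harness, not the model, owns the budget for each incident.
budget = ToolBudget(int(os.environ.get("MAX_TOOL_CALLS", "8")))
```

When `charge` raises, the harness would stop the loop and escalate with the evidence gathered so far, matching the cost-control rule in the system prompt.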
Allow-list safe runbook actions
Only reversible, low-blast-radius actions belong here. Risky actions stay approval-gated.
# .incident.yml
autonomous_actions:
  - restart_stateless_pod
  - clear_cache
  - scale_up_within_cap
  - silence_known_false_alert
require_approval:
  - rollback_deploy
  - scale_down
  - shift_traffic
always_escalate: [ "sev1", "data_layer", "security" ]
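How an executor might consult this allow-list is sketched below. It assumes the YAML has already been parsed into a dict and that incidents carry tags such as `sev1`; the `gate` function and its three return values are illustrative, not the kit's actual interface.

```python
POLICY = {
    "autonomous_actions": [
        "restart_stateless_pod", "clear_cache",
        "scale_up_within_cap", "silence_known_false_alert",
    ],
    "require_approval": ["rollback_deploy", "scale_down", "shift_traffic"],
    "always_escalate": ["sev1", "data_layer", "security"],
}


def gate(action: str, incident_tags: set[str], mode: str, policy: dict = POLICY) -> str:
    """Return 'execute', 'propose', or 'escalate' for a requested action."""
    # Escalation triggers win over everything else.
    if incident_tags & set(policy["always_escalate"]):
        return "escalate"
    # Only allow-listed actions run, and only in responder mode.
    if mode == "responder" and action in policy["autonomous_actions"]:
        return "execute"
    # Everything else, including unknown actions, is propose-only.
    return "propose"


print(gate("clear_cache", set(), "responder"))      # prints execute
print(gate("rollback_deploy", set(), "responder"))  # prints propose
print(gate("clear_cache", {"sev1"}, "responder"))   # prints escalate
```

Defaulting unknown actions to propose-only means a typo or a newly added tool fails safe.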
Replay a past incident
Validate the agent's reasoning and proposed actions against a known incident before going live.
incident-agent replay --incident 2026-05-INC-204 --explain
# prints severity, hypothesis, evidence, proposed actions, status update
Wire it to your alerting
Route alerts to the agent as a first responder. Start in copilot mode (proposes only), then enable responder mode for allow-listed actions once trust is built.
# Alertmanager receiver -> POST https://your-host/incident/alert (HMAC)
# Promote MODE=responder after reviewing a few weeks of proposals
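The HMAC check on the alert endpoint is not specified further here, so the sketch below makes its own assumptions: HMAC-SHA256 over the raw request body, with the signature sent hex-encoded. How the signature travels (header name, prefix) and how the shared secret is provisioned are left open.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Constant-time comparison of the expected and received signatures."""
    expected = sign(secret, body)
    # Compare as bytes so a non-ASCII signature is rejected instead of raising.
    return hmac.compare_digest(expected.encode("utf-8"), signature_hex.encode("utf-8"))


secret = b"example-shared-secret"  # placeholder; load from a secret store in practice
body = b'{"alertname": "HighErrorRate", "service": "checkout-service"}'
signature = sign(secret, body)
print(verify(secret, body, signature))              # prints True
print(verify(secret, body + b"tampered", signature))  # prints False
```

Verification should run on the raw bytes as received, before any JSON parsing, since re-serializing the payload can change the byte sequence.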
Workflow
1. Receive and dedup the alert
Pull the alert, check it against active incidents, and normalize the affected service, signal, and breach. Drop confirmed duplicates into the existing incident.
2. Correlate telemetry
Query the minimal set of metrics, logs, and recent deploys around the alert window needed to test a hypothesis — not every dashboard.
3. Hypothesize and score
Propose the most likely cause from the evidence, assign severity, and estimate blast radius, citing the specific signals behind each claim.
4. Decide within authority
If the incident matches a known pattern with an allow-listed, reversible fix and confidence is high, AUTO_MITIGATE. If the fix is riskier but evidence-backed, PROPOSE it for approval. Otherwise, ESCALATE.
5. Act through the guard
The executor runs only low-risk, reversible, allow-listed actions; production-changing actions are staged for one-click human approval with the evidence attached.
6. Communicate
Post an honest, concise status update — impact, what's known, what's next — without inventing ETAs or root causes.
7. Escalate or resolve, then record
Page on-call for SEV1/SEV2 or low confidence with a clean handoff; otherwise verify recovery and log the full timeline for the postmortem.
Examples
Error spike right after a deploy → propose rollback
Error rate jumps minutes after a deploy. Evidence points squarely at the release; rollback is risky, so the agent proposes it for approval rather than executing.
Input
Alert: checkout-service 5xx rate 0.4% -> 9% over 6 min. Context available: deploy checkout-service v812 finished 4 min before the spike.
Output
{
"severity": "SEV2",
"confidence": 0.83,
"hypothesis": "Deploy checkout-service v812 introduced the regression: 5xx began ~4 min after rollout and errors trace to a new code path added in v812.",
"evidence": ["metric: 5xx 0.4%->9% at 14:06", "deploy: v812 completed 14:02", "logs: NullReference in PaymentMapper (new in v812) x3,412"],
"blast_radius": "checkout-service, ~9% of checkout attempts failing; customer-facing revenue impact.",
"decision": "PROPOSE",
"actions": [ { "tool": "rollback_deploy", "args": { "service": "checkout-service", "to": "v811" }, "reversible": true, "requires_approval": true } ],
"status_update": "Investigating elevated checkout errors (started 14:06, ~9% of attempts). Strong signal points to deploy v812; proposing rollback to v811 pending on-call approval. Next update in 10 min.",
"escalation": { "needed": false, "page": false, "reason": "", "handoff": "" }
}
Note: Customer-facing errors make this SEV2, and the evidence is strong, but a rollback changes production, so the agent proposes it for one-click approval instead of firing it. The status update is specific and promises only the next update time, not a fix ETA.
Flapping non-actionable alert → safe auto-suppress
A known-noisy alert fires again with no real impact. The agent suppresses it with justification rather than paging a human at 3am.
Input
Alert: batch-worker CPU > 80% for 5 min (fires nightly during the scheduled report job).
Output
{
"severity": "SEV4",
"confidence": 0.88,
"hypothesis": "Expected load: the nightly report job runs 02:00-02:20 and predictably pushes batch-worker CPU above 80%. No user-facing impact.",
"evidence": ["metric: CPU spike aligns 02:00-02:20 exactly as prior 14 nights", "no queue backlog growth", "no error-rate change", "no customer-facing service affected"],
"blast_radius": "None — internal batch worker, within normal nightly envelope.",
"decision": "AUTO_MITIGATE",
"actions": [ { "tool": "run_runbook_step", "args": { "step": "silence_known_false_alert", "duration": "30m", "scope": "batch-worker-cpu-nightly" }, "reversible": true, "requires_approval": false } ],
"status_update": "Silenced batch-worker CPU alert for 30m: matches the known nightly report-job pattern (02:00-02:20), no backlog or error impact. Recommend tuning this alert's schedule.",
"escalation": { "needed": false, "page": false, "reason": "", "handoff": "" }
}
Note: Reducing noise is a core job, but it's done carefully: the agent cites four pieces of evidence that the spike is benign, silences for a bounded 30 minutes (reversible), and recommends a permanent fix. It suppresses because it can prove the alert is non-actionable, not to keep the dashboard green.
Ambiguous data-layer outage → escalate and page
Multiple services degrade with conflicting signals and possible data impact. The agent refuses to take a risky guess and escalates immediately.
Input
Alerts: api latency p99 12s, db-primary connection errors, and a replica lag alarm — all within 2 minutes. No recent deploy. Some writes may be failing.
Output
{
"severity": "SEV1",
"confidence": 0.45,
"hypothesis": "Possible database-layer incident (primary connectivity + replica lag) affecting multiple services; root cause unclear from current signals and no deploy correlation.",
"evidence": ["metric: api p99 12s across 3 services", "logs: db-primary connection refused x hundreds", "metric: replica lag climbing", "deploys: none in the last 3h"],
"blast_radius": "Broad: multiple customer-facing services degraded; potential write failures and data risk.",
"decision": "ESCALATE",
"actions": [ { "tool": "page_oncall", "args": { "severity": "sev1", "target": "primary+db_oncall" }, "reversible": true, "requires_approval": false } ],
"status_update": "Declaring SEV1: multiple services degraded with database-primary connection errors and rising replica lag (since 03:11). Possible data-layer incident; paging DB on-call now. No automated remediation is being attempted given data risk. Next update in 10 min.",
"escalation": { "needed": true, "page": true, "reason": "SEV1, broad blast radius, possible data-layer failure with write/data risk, low confidence in cause.", "handoff": "DB primary connectivity failing + replica lag, no deploy correlation, multiple services hit. Suggest: check DB primary health/failover status, connection pool saturation, and whether a failover is safe. Do NOT mass-restart app tier (would worsen connection storms)." }
}
Note: The defining example: low confidence, broad blast radius, and data risk make this an immediate SEV1 escalate-and-page. The agent takes no production action, posts an honest holding update, and, crucially, warns the human against a tempting-but-harmful action (mass restart). This is the behavior that makes autonomy safe in an incident.
Implementation notes
- Define the autonomous action allow-list narrowly: only reversible, low-blast-radius steps. Everything that changes production state should be propose-and-approve, enforced outside the model.
- Never let the agent silence an alert it can't prove is non-actionable. Suppression needs cited evidence and a bounded duration, or it becomes a way to hide real incidents.
- Always require a blast-radius estimate before any action; broad scope or critical/customer-facing services automatically disqualify autonomous action.
- Start in copilot mode (proposes only). Review proposals for a few weeks, then enable responder mode for the allow-listed actions you trust.
- Status updates should state impact and the next update time, never an invented ETA or unconfirmed root cause — over-promising during an incident destroys trust.
- Log the full timeline (signals, hypothesis, actions, outcome) for every incident; it seeds the postmortem and shows which patterns are safe to automate next.
- Dedup and signal-gathering run on a cheaper model; the strong model handles the hypothesis and severity decisions.
Variations
Basic
Triage co-pilot
Correlates the alert with metrics, logs, and deploys and posts a severity, hypothesis, and suggested actions to the incident channel. Proposes only — humans act. The safe default.
Advanced
Guarded first responder
Auto-executes allow-listed, reversible mitigations for known patterns, stages risky actions (rollback, scale-down) for one-click approval, and drafts status updates, with SEV1 auto-escalation.
Enterprise
Org-wide incident commander
Adds service-aware policies and ownership routing, multi-signal correlation across teams, audited approvals, automatic postmortem timelines, and tuning of auto-mitigation patterns from incident outcomes.
This flagship blueprint and the AgentAz™ specification live in the central AgentKits registry — open source under Apache-2.0 (code & schema) and CC‑BY‑4.0 (text).
Frequently asked questions
Only narrow, reversible, allow-listed ones (like restarting a stateless pod or silencing a proven-false alert). Anything that changes production state — rollbacks, scaling down, traffic shifts — is proposed for human approval, never executed autonomously.
Strictly from telemetry it queries — metrics, logs, traces, and recent deploys — and every claim in its hypothesis cites the specific signal. If the evidence is missing or conflicting, it lowers confidence and escalates instead of guessing.
No. It can only suppress an alert it can prove is non-actionable, with cited evidence and a bounded duration, and it never downgrades severity to keep dashboards clean.
It declares the severity, pages on-call, posts an honest holding update, takes no risky automated action, and hands over a structured summary with evidence and suggested next steps — including what not to do.
Start in copilot mode where it only proposes, replay past incidents to validate its reasoning, and enable guarded auto-mitigation for specific allow-listed actions once you trust its proposals.
It queries the minimal set of signals needed to test a hypothesis rather than pulling every dashboard, caps its tool calls, and stops investigating once it can decide — escalating with current evidence if it hits the cap.