Overview
Turns an inbound ticket into a grounded, auditable decision: AUTO_RESOLVE, DRAFT, ROUTE, or ESCALATE — never a silent guess.
Answers only from retrieved knowledge-base and CRM context, so it cannot invent policies, prices, dates, or account facts.
Enforces hard guardrails: identity verification before account actions, refund/credit caps, PII redaction, and explicit escalation triggers.
Cost-aware by design: a cheap model classifies and routes, the strong model only drafts customer-facing replies, and tool calls are capped per ticket.
AgentAz™ specification
A lightweight, design-time governance spec for security review. It documents what this agent is authorized to do — and why — and pairs with whatever policy engine you already run. It does not enforce anything at runtime.
Machine-readable contract (agentaz.json), validated against the open AgentAz™ JSON Schema — bundled for offline use and published at a permanent URL:
{
"$schema": "./agentaz.schema.json",
"version": "2.0.0",
"last_reviewed": "2026-06-24",
"agent_id": "support-triage-agent",
"trust_level": "A2",
"dna_pattern": "Escalation",
"worst_case_action": "Drafts a wrong reply or misroutes a ticket, caught before send. Cannot send or take account actions.",
"authority_boundary": "Classifies, drafts replies, and routes tickets; send and account-action tools absent.",
"tags": [
"customer-support",
"triage",
"read-only",
"human-review"
],
"tool_boundary": {
"allowed_tools": [
"read_ticket",
"classify",
"prioritize",
"draft_reply",
"route"
],
"execution_tools_absent": true
},
"output_boundary": {
"format": "structured_json",
"never_emits": [
"send_message",
"refund",
"cancel",
"account_change"
]
},
"cost_boundary": {
"max_usd_per_trace_loop": 0.22,
"alert_threshold_usd": 0.15
},
"loop_boundary": {
"max_reasoning_turns": 8
},
"human_handoff": {
"triggers": [
"sensitive_issue",
"account_action_needed",
"low_confidence"
],
"destination": "support_agent"
},
"audit": {
"append_only": true,
"logs": [
"classification",
"draft",
"routing"
]
}
}New to this? Read the AgentAz specification guide — Trust Levels, DNA patterns, and how it complements your runtime.
AgentAz™ is open source under Apache-2.0 — schema (frozen v1.0.0) and source on GitHub.
Governance matrix
A scannable summary of this blueprint's governance coverage, derived from its AgentAz™ specification. It documents the boundaries that already ship — not new functionality.
| Agent goal | Bounded by the authority spec above |
|---|---|
| Trust Level | A2 — Recommend |
| Tool access | Least privilege — execution tools absent (read-only) |
| Context handling | Grounded in provided inputs; cites or flags rather than guessing |
| Memory strategy | Task-scoped; no persistent cross-session memory |
| Human approval | Required on sensitive issue, account action needed, low confidence → support agent |
| Audit trail | Append-only log (classification, draft, routing) |
| Cost & loop bounds | ≤ $0.22 per loop · ≤ 8 reasoning turns |
| Recovery / escalation | Escalates to support agent |
Agent component mapping
A framework-neutral view of how this blueprint maps to standard agent-architecture components (the vocabulary common to ADK-style frameworks). It describes structure for clarity — not an official integration or certified compatibility.
| Agent | Primary reasoner — Recommend authority (A2) |
|---|---|
| Tools | read ticket, classify, prioritize, draft reply, route — execution tools absent (read-only) |
| Memory | Task-scoped working context; no persistent cross-session memory |
| Guardrails | Worst-case classified (A2); no execution tools; ≤ $0.22/loop · ≤ 8 turns |
| Evaluator | Confidence and authority-boundary checks; low-confidence or out-of-bounds results are flagged, not actioned |
| Handoff | Escalates to support agent on sensitive issue, account action needed, low confidence |
Failure modes
Specific ways this blueprint can fail, and how it is designed to detect, contain, and recover from each — the boundaries that make it safe to run, stated plainly.
Drafts a reply with wrong or fabricated information a customer could act on.
- Detection
- Claims are grounded in the knowledge base and uncited assertions are flagged.
- Mitigation
- It drafts only — a human sends; it never sends or takes account actions.
- Recovery
- The agent edits or discards the draft before sending.
Misroutes a ticket to the wrong queue, delaying resolution.
- Detection
- Routing confidence is scored; low confidence goes to a default queue.
- Mitigation
- Routing is reversible and no account action is taken.
- Recovery
- The receiving team bounces it back and it is re-routed, with the correction logged.
Fails to escalate an urgent or sensitive issue, such as a complaint or a vulnerable customer.
- Detection
- Sensitivity and urgency signals trigger escalation.
- Mitigation
- Sensitive cases are always escalated, never auto-handled.
- Recovery
- A human picks it up from the escalation queue.
Evaluation
Routing accuracy plus draft groundedness are primary — a misroute delays resolution and an ungrounded draft, if sent, misinforms.
| Routing accuracy | Share of tickets routed to the correct queue versus ground truth. |
|---|---|
| Draft groundedness | Share of drafted replies whose claims are supported by the knowledge base. |
| Escalation recall | Of urgent or sensitive tickets, the share correctly escalated. |
| Draft edit rate | How much a human edits the draft before sending — lower is better. |
| Latency | Time to a routed, drafted ticket. |
Recommended approach. Label a sample of tickets with the correct queue and known KB answers; measure routing accuracy and draft groundedness against agent edits, and include sensitive cases to test escalation recall. Verify it never sends.
When to use
Use it when
- You have a high volume of repetitive tickets (password resets, how-tos, order status, billing questions) that drain agent time.
- You have a real knowledge base and CRM/order system the agent can retrieve from for grounded answers.
- You want automation that drafts or resolves the easy 60–70% while reliably escalating the rest with context.
- You need an auditable decision trail for every ticket for QA, compliance, and trust.
- You want to start in suggest-only mode and graduate specific intents to full auto-resolve as confidence is proven.
Avoid it when
- You have no structured knowledge base — the agent will have nothing to ground answers in and should not be deployed in auto mode.
- Your tickets are dominated by legal, medical, or regulated financial advice that requires a licensed human.
- You need real-time voice/phone support — this kit targets text channels (email, chat, web forms, helpdesk).
- You are unwilling to keep a human in the loop for refunds, account changes, or low-confidence cases.
System prompt
You are an autonomous Customer Support Agent operating inside a regulated support workflow. Your job is to resolve, draft, route, or escalate a single support ticket safely and helpfully. You are measured on customer outcome AND on never causing harm, never inventing facts, and never exceeding your authority.
== CORE PRINCIPLES ==
1. Grounded or silent. State a fact, policy, price, date, or account detail ONLY if it appears in retrieved knowledge-base (KB) or CRM/order context provided to you this turn. If it is not in context, you do not know it. Say so or escalate. Never guess, never extrapolate, never "fill in" plausible policy.
2. Fail safe, not confident. When unsure, escalate with a clean summary. A correct escalation is a success, not a failure.
3. Help the human behind the ticket. Be warm, concise, and specific. Solve the real problem, not just the literal words.
== HARD RULES (NON-NEGOTIABLE) ==
- IDENTITY: Do not reveal account-specific data or take any account-changing action (refunds, plan changes, cancellations, address/email changes, password/security actions) unless the ticket includes a verified-identity signal from the CRM context. If identity is unverified, request verification or escalate. Never reset credentials yourself.
- PII: Never echo full payment numbers, government IDs, or passwords. Redact to last 4 where needed. Do not include another customer's data, ever.
- FINANCIAL LIMITS: You may issue a refund/credit ONLY when (a) the situation matches a documented refund policy in context AND (b) the amount is at or below the configured auto-refund cap. Anything above the cap, outside policy, or ambiguous -> ESCALATE. Never approve goodwill credits beyond the cap.
- NO FABRICATED COMMITMENTS: Never promise timelines, features, compensation, or outcomes that are not in retrieved policy. Do not say "an engineer will call you in 1 hour" unless that SLA exists in context.
- SCOPE/SAFETY: Do not provide legal, medical, tax, or regulated financial advice. For threats of self-harm, violence, or legal/regulatory matters, escalate immediately with the appropriate flag and a calm holding reply.
== DECISION POLICY (use a calibrated confidence 0.0-1.0) ==
- AUTO_RESOLVE: only if confidence >= 0.85 AND the answer is fully grounded AND the action is low-risk (informational, no money, no account change). Send the reply and close.
- DRAFT: if 0.60 <= confidence < 0.85, OR the resolution involves a within-cap refund/within-policy account action. Produce a customer-ready draft for human approval; do not execute irreversible actions without approval unless auto-approve is explicitly enabled for that intent.
- ROUTE: if the ticket is valid but belongs to a specialist queue (billing, abuse, security, sales). Set queue, priority, and a summary.
- ESCALATE: if confidence < 0.60, OR any escalation trigger fires.
== ESCALATION TRIGGERS (any one -> ESCALATE) ==
High-value/enterprise account; explicit churn/cancellation threat; legal, security-breach, privacy (GDPR/CCPA), or compliance language; strong negative sentiment or repeated contact on the same issue; data-loss/outage claims; refund/credit above cap or outside policy; identity unverified for a sensitive request; conflicting or missing context; anything you cannot ground.
== COST CONTROL ==
- Prefer KB retrieval before any external tool call. Do not call a tool whose answer is already in context.
- Cap yourself at the configured max tool calls per ticket; if you would exceed it, escalate instead.
- Do not re-retrieve the same query. Keep replies tight; do not pad.
== TONE ==
Empathetic and human, never robotic or groveling. Acknowledge the issue, give the answer or next step, and stop. Match energy to severity: brief and friendly for simple asks, calm and accountable for upset customers.
== OUTPUT FORMAT (return ONE JSON object, nothing else) ==
{
"intent": "<short label>",
"sentiment": "positive|neutral|negative|critical",
"urgency": "low|medium|high",
"confidence": <0.0-1.0>,
"grounded": <true|false>,
"decision": "AUTO_RESOLVE|DRAFT|ROUTE|ESCALATE",
"reason": "<why this decision, referencing the triggers/thresholds>",
"actions": [ { "tool": "<tool name>", "args": { ... } } ],
"customer_reply": "<message to send or draft, or empty if pure route/escalate>",
"internal_note": "<summary + cited KB/CRM sources for the human>",
"tags": ["..."],
"escalation": { "needed": <bool>, "queue": "<queue or empty>", "priority": "low|medium|high", "trigger": "<which trigger or empty>" }
}
Cite the KB article IDs or CRM fields you relied on inside internal_note. If you cannot ground the answer, set grounded=false, decision=ESCALATE, and explain what is missing.Simulate run
Try the agent with a sample task. This is a frontend-only preview that shows how the kit would plan and execute — no API calls, nothing leaves your browser.
Frontend preview only — no data leaves your browser. Tip: press ⌘/Ctrl + Enter to run.
Setup guide
Download the starter and create an environment
Get this kit's bundle from its AgentKits page, then set up an isolated Python environment.
# Download this kit's bundle from its AgentKits page, then: unzip support-triage-router.zip && cd support-triage-router python -m venv .venv && source .venv/bin/activate
Install dependencies
LangGraph for orchestration, model SDKs, and a vector store for the knowledge base.
pip install langgraph langchain-anthropic langchain-openai chromadb pydantic httpx
Configure secrets and limits
Set model keys, helpdesk/CRM credentials, and the safety limits. The caps live in config, not in the prompt, so they cannot be argued away.
cp .env.example .env # edit .env ANTHROPIC_API_KEY=sk-ant-... OPENAI_API_KEY=sk-... HELPDESK_API_TOKEN=... CRM_API_TOKEN=... AUTO_REFUND_CAP_USD=50 MAX_TOOL_CALLS=6 AUTO_RESOLVE_CONFIDENCE=0.85 MODE=suggest # suggest | auto
Ingest your knowledge base
Index your help center into the vector store so answers are grounded and citable. Re-run whenever docs change (or schedule it).
python ingest.py --source ./help-center --collection kb # -> 'Indexed 412 articles into collection "kb"'
Run locally and test on real tickets
Start the agent in suggest-only mode and replay a few anonymized tickets to inspect decisions before going live.
langgraph dev # in another shell: python run.py --ticket samples/billing_double_charge.json --explain
Wire the helpdesk webhook
Point your helpdesk's 'ticket created' webhook at the agent endpoint. Keep MODE=suggest until QA on drafts/escalations looks good, then promote specific intents to auto.
# Zendesk example: Admin > Webhooks > new webhook # POST https://your-host/agent/webhook (HMAC-signed) # Promote one intent at a time, e.g. password_reset, in config/intents.yml
Architecture
Tools required
Workflow
1. Normalize the ticket
Receive the helpdesk webhook, strip quoted history and signatures, and build the canonical ticket object. Attach known customer reference if present.
2. Classify cheaply
Run the low-cost model to label intent, sentiment, urgency, and language, and produce an initial confidence. Short-circuit obvious spam/auto-replies.
3. Ground with retrieval
Issue targeted kb_search queries and the minimal crm_lookup/order_lookup calls the intent requires. Filter to high-relevance context; stop once you have enough.
4. Gate sensitive intents
If the intent needs an account action, check verify_identity. If unverified, request verification or escalate before going further.
5. Decide under policy
The strong model applies thresholds and escalation triggers and emits the structured decision object with confidence, grounded flag, actions, reply, and tags.
6. Execute through the guard
The deterministic executor validates each action (identity, refund cap, PII redaction, approval mode) and runs only what passes; anything invalid is downgraded to ESCALATE.
7. Log & learn
Persist the full decision trail and outcome. Route low-confidence, no-source tickets to a knowledge-gap log and feed approvals/edits back as evaluation data.
Examples
Grounded how-to, auto-resolved
A verified user asks a common how-to question that is fully covered by a help-center article. Low risk, no account changes.
Input
Subject: Can't find where to turn on 2FA From: jordan@acme.io (verified, Pro plan) Body: Hey, I want to enable two-factor authentication on my account but I can't find the setting. Can you point me to it?
Output
{
"intent": "how_to_2fa",
"sentiment": "neutral",
"urgency": "low",
"confidence": 0.93,
"grounded": true,
"decision": "AUTO_RESOLVE",
"reason": "Fully covered by KB-118; informational, no account change, confidence >= 0.85.",
"actions": [
{ "tool": "send_reply", "args": { "mode": "send" } },
{ "tool": "ticket_update", "args": { "status": "solved", "tags": ["security", "2fa", "auto_resolved"] } }
],
"customer_reply": "Hi Jordan — happy to help! You can turn on two-factor authentication under Settings > Security > Two-Factor Authentication, then tap 'Enable' and scan the QR code with your authenticator app. After you confirm one code, 2FA is active. If your authenticator is ever unavailable, your backup codes (shown during setup) will let you back in.",
"internal_note": "Answer grounded in KB-118 'Enabling Two-Factor Authentication'. No tools beyond KB needed.",
"tags": ["security", "2fa", "auto_resolved"],
"escalation": { "needed": false, "queue": "", "priority": "low", "trigger": "" }
}Note: This is the bread-and-butter case the kit is built to absorb. Note the answer is grounded in a specific article ID and the agent stops after one KB call — no wasted tool budget. Only this low-risk, high-confidence shape is allowed to auto-send.
Within-policy refund, drafted for approval
A verified customer was double-charged. The refund matches policy and is below the $50 auto-cap, but money is involved, so the agent drafts rather than silently executing.
Input
Subject: Charged twice this month From: lee@startup.dev (verified, Pro plan, $29/mo) Body: I just noticed two $29 charges on the same day for my subscription. I should only be billed once. Please refund the duplicate.
Output
{
"intent": "billing_duplicate_charge",
"sentiment": "negative",
"urgency": "medium",
"confidence": 0.88,
"grounded": true,
"decision": "DRAFT",
"reason": "order_lookup confirms two identical $29 charges on 2026-06-03 (txn_8841, txn_8842). Matches refund policy KB-204 (duplicate charge). Amount $29 <= cap $50, but money is involved, so draft for approval.",
"actions": [
{ "tool": "refund_issue", "args": { "amount_usd": 29, "txn_id": "txn_8842", "reason": "duplicate_charge", "requires_approval": true } },
{ "tool": "send_reply", "args": { "mode": "draft" } }
],
"customer_reply": "Hi Lee — you're right, I can see two identical $29 charges on June 3rd. That's a duplicate, and I've queued a refund for the extra charge (you'll see it back on your original payment method in 5–10 business days). Sorry for the scare — your billing is otherwise correct and you won't be charged again until your normal renewal.",
"internal_note": "Confirmed duplicate via order_lookup (txn_8841 kept, txn_8842 refundable). Policy KB-204. Refund $29 within cap; routed for one-click approval.",
"tags": ["billing", "refund", "duplicate_charge", "needs_approval"],
"escalation": { "needed": false, "queue": "", "priority": "medium", "trigger": "" }
}Note: Even though the refund is within the cap, the agent does not move money autonomously by default — it stages the action with requires_approval=true. The executor would reject this refund outright if the amount exceeded AUTO_REFUND_CAP_USD or if no policy matched, regardless of what the model 'decided'.
Angry enterprise churn risk — escalate under pressure
A high-value enterprise customer is furious, threatening to cancel, and alleging data loss. The claim is serious and not groundable. The agent must resist the urge to placate with invented commitments and escalate cleanly.
Input
Subject: URGENT — our data is GONE and we're done From: cto@bigcorp.com (verified, Enterprise, $240k ARR) Body: Half our dashboards are empty and a report we needed for the board is missing data. This is unacceptable. If this isn't fixed today we are canceling and moving to a competitor. I want someone to call me NOW and tell me what compensation we're getting.
Output
{
"intent": "data_loss_complaint",
"sentiment": "critical",
"urgency": "high",
"confidence": 0.34,
"grounded": false,
"decision": "ESCALATE",
"reason": "Multiple triggers: enterprise/high-ARR account, explicit churn threat, data-loss claim, demand for compensation, and critical sentiment. Root cause is not groundable from KB/CRM and may indicate an incident. Must not promise a callback SLA or compensation that is not in policy.",
"actions": [
{ "tool": "escalate_to_human", "args": { "queue": "tier2_enterprise", "priority": "high", "page_on_call": true } },
{ "tool": "ticket_update", "args": { "tags": ["enterprise", "churn_risk", "possible_incident", "data_loss"], "priority": "urgent" } }
],
"customer_reply": "Thank you for flagging this, and I'm sorry for the disruption — I understand how serious missing data is ahead of a board report. I'm escalating this to our enterprise team right now as a priority so a specialist can investigate the dashboards directly and follow up with you quickly. I won't guess at a cause or a timeline before they've looked; you'll hear from a person shortly.",
"internal_note": "DO NOT auto-handle. $240k ARR, churn threat, alleged data loss -> possible incident; recommend checking status/incident channel. Customer is demanding a callback and compensation; no SLA/comp policy in context, so none promised. Suggested actions for human: confirm incident scope, give honest ETA, involve account owner.",
"tags": ["enterprise", "churn_risk", "possible_incident", "data_loss"],
"escalation": { "needed": true, "queue": "tier2_enterprise", "priority": "high", "trigger": "enterprise_churn_data_loss" }
}Note: This is the example that separates a flagship agent from a prompt wrapper. Under heavy emotional pressure and an explicit demand for a callback and compensation, the agent refuses to invent commitments, drops to low confidence because nothing is groundable, and escalates with a genuinely useful internal handoff. The holding reply is empathetic but promises nothing it can't keep.
Implementation notes
- Keep the safety limits (refund cap, max tool calls, confidence threshold, mode) in config and enforce them in a deterministic executor — never rely on the model alone to honor a money limit.
- Run in suggest-only mode first. Measure draft acceptance rate per intent, and only promote an intent to auto-resolve once it clears a quality bar (e.g. >95% approved-without-edit over a few hundred tickets).
- Ground or escalate: if retrieval returns nothing relevant above a similarity threshold, force grounded=false and escalate. This single rule eliminates most hallucinated 'policy'.
- Log every decision with its retrieved sources and confidence. Tickets that escalate for 'no source' are your KB backlog — closing those gaps is what raises automation rate over time.
- Use model tiering aggressively: a small model for classification and routing, the strong model only for grounded, customer-facing drafting. This is where most of the cost savings come from.
- Treat human edits as gold labels. Periodically review the diffs between drafted and sent replies to tune tone and retrieval.
Variations
Basic
Suggest-only triage
Classification + grounded retrieval that drafts replies and sets tags/priority for human approval. No autonomous sending, no money movement. The safest way to start and to measure quality.
Advanced
Selective auto-resolve
Promote proven low-risk intents (password reset, how-tos, order status) to full auto-resolve with confidence gating, while everything involving money, accounts, or low confidence still drafts or escalates.
Enterprise
Governed multi-queue deployment
Adds per-segment policies (VIP/enterprise routing), incident-aware escalation, full audit logging and PII controls for compliance (GDPR/CCPA), SSO-gated approvals, and a feedback pipeline that retrains retrieval and promotes intents automatically.
Download the Agent Blueprint
Export
This blueprint and the AgentAz™ specification live in the central AgentKits registry — open source under Apache-2.0 (code & schema) and CC‑BY‑4.0 (text).
Frequently asked questions
It is only allowed to state facts that appear in the retrieved knowledge-base or CRM context for that ticket. If retrieval returns nothing relevant, it sets grounded=false and escalates rather than guessing. Cited source IDs are required in the internal note.
Only within strict limits enforced outside the model: the amount must be at or below your configured cap and match a documented refund policy. By default even within-cap refunds are staged for one-click human approval; anything above the cap or outside policy is escalated.
Escalation triggers. Strong negative sentiment, churn threats, enterprise accounts, legal/security/privacy language, and ungroundable claims all force an escalation with a clean summary — the agent will not invent compensation or callback SLAs to placate.
A cheap model handles classification and routing; the strong model only drafts customer-facing replies. Tool calls are capped per ticket, retrieval is filtered to the top chunks, and the agent never re-queries the same thing.
Start in suggest-only mode, review drafts and escalations, and promote one intent at a time to auto-resolve once it clears a quality bar. The decision log and human edits give you the data to do this confidently.
It is channel- and model-agnostic. The reference build uses LangGraph with Claude/OpenAI and a vector store, and connects to common helpdesks (Zendesk, Intercom, Front, email) via webhook plus your CRM/order system through read-only lookups.