AgentKits

Email Reply Drafting Agent

Production Blueprint
0TrendingNew

Includes Agent Blueprint + Implementation Guide

An agent that drafts replies to your incoming email: it reads the thread, uses the context you provide, and writes a response in your voice that you can review, edit, and send. It drafts; it never sends on its own. It is built defensively: it never makes commitments, promises, prices, or agreements on your behalf that you haven't approved, it doesn't fabricate facts to fill a reply, it flags sensitive or high-stakes emails for careful human handling, and it protects private information.

emailinboxdraftingcommunicationproductivityautonomous-agentassistantwritingagentazagent-governancetrust-levelproduction-readiness
StackClaude, LangGraph, OpenAI
DifficultyIntermediate
Setup35 min
Version2.0.0 · 2026-06-21

Overview

Reads the thread and drafts a contextual reply in your voice.

Drafts only — you review, edit, and send; it never sends on its own.

Won't make commitments, prices, or promises you haven't approved.

Defensive: doesn't fabricate facts, flags sensitive emails, and protects private information.

AgentAz™ specification

A lightweight, design-time governance spec for security review. It documents what this agent is authorized to do — and why — and pairs with whatever policy engine you already run. It does not enforce anything at runtime.

Trust Level ?A2 — Recommend
DNA PatternEvaluation (Research → Evaluate)
Worst-Case ActionDrafts an incorrect or off-tone reply that a human reviews before sending. It cannot send email or take any inbox action — the send tool is absent from its registry.
Authority BoundaryReads an email and drafts a suggested reply grounded in the thread and provided context. A human reviews and sends. It never sends autonomously, never makes commitments on the user's behalf, and never fabricates facts.
Verification TestAttempt to call a send-email tool → confirm it is absent from the agent's registry; confirm output is a draft only.
Production Readiness6/6 dimensions passing. Tool isolation: send tools absent. Human gates: a human sends. Confidence escalation: sensitive or uncertain threads flagged. Cost ceiling: bounded per email. Audit trail: drafts logged. Escalation path: commitments or sensitive topics flagged for the user.
Last Reviewed2026-06-24

Machine-readable contract (agentaz.json), validated against the open AgentAz™ JSON Schema — bundled for offline use and published at a permanent URL:

agentaz.json
{
  "$schema": "./agentaz.schema.json",
  "version": "2.0.0",
  "last_reviewed": "2026-06-24",
  "agent_id": "email-reply-drafter",
  "trust_level": "A2",
  "dna_pattern": "Evaluation",
  "worst_case_action": "Drafts a wrong reply caught before send. Cannot send email.",
  "authority_boundary": "Drafts replies grounded in the thread; send tools absent; no commitments.",
  "tags": [
    "email",
    "drafting",
    "read-only",
    "human-review"
  ],
  "tool_boundary": {
    "allowed_tools": [
      "read_thread",
      "draft_reply",
      "check_tone",
      "ground_in_context"
    ],
    "execution_tools_absent": true
  },
  "output_boundary": {
    "format": "structured_json",
    "never_emits": [
      "send_email",
      "commitment"
    ]
  },
  "cost_boundary": {
    "max_usd_per_trace_loop": 0.2,
    "alert_threshold_usd": 0.14
  },
  "loop_boundary": {
    "max_reasoning_turns": 8
  },
  "human_handoff": {
    "triggers": [
      "sensitive_topic",
      "commitment_implied",
      "low_confidence"
    ],
    "destination": "user"
  },
  "audit": {
    "append_only": true,
    "logs": [
      "drafts"
    ]
  }
}

New to this? Read the AgentAz specification guide — Trust Levels, DNA patterns, and how it complements your runtime.

AgentAz™ is open source under Apache-2.0 — schema (frozen v1.0.0) and source on GitHub.

Governance matrix

A scannable summary of this blueprint's governance coverage, derived from its AgentAz™ specification. It documents the boundaries that already ship — not new functionality.

Agent goalBounded by the authority spec above
Trust LevelA2 — Recommend
Tool accessLeast privilege — execution tools absent (read-only)
Context handlingGrounded in provided inputs; cites or flags rather than guessing
Memory strategyTask-scoped; no persistent cross-session memory
Human approvalRequired on sensitive topic, commitment implied, low confidence → user
Audit trailAppend-only log (drafts)
Cost & loop bounds≤ $0.2 per loop · ≤ 8 reasoning turns
Recovery / escalationEscalates to user

Agent component mapping

A framework-neutral view of how this blueprint maps to standard agent-architecture components (the vocabulary common to ADK-style frameworks). It describes structure for clarity — not an official integration or certified compatibility.

AgentPrimary reasoner — Recommend authority (A2)
Toolsread thread, draft reply, check tone, ground in context — execution tools absent (read-only)
MemoryTask-scoped working context; no persistent cross-session memory
GuardrailsWorst-case classified (A2); no execution tools; ≤ $0.2/loop · ≤ 8 turns
EvaluatorConfidence and authority-boundary checks; low-confidence or out-of-bounds results are flagged, not actioned
HandoffEscalates to user on sensitive topic, commitment implied, low confidence

Failure modes

Specific ways this blueprint can fail, and how it is designed to detect, contain, and recover from each — the boundaries that make it safe to run, stated plainly.

Drafts an off-tone or inappropriate reply that, if sent, damages a relationship.

Detection
Tone is checked against context and sensitive threads are flagged.
Mitigation
It drafts only — there is no send tool; a human reviews and sends.
Recovery
The human edits or discards the draft before sending.

Includes a fact not in the thread (a hallucinated detail).

Detection
Claims are grounded in the thread and ungrounded statements are flagged.
Mitigation
It grounds replies in provided context and never fabricates.
Recovery
The human corrects it before sending.

Implies a commitment on the user's behalf.

Detection
Commitment language is flagged.
Mitigation
Commitments and sensitive topics are flagged for the user.
Recovery
The user decides whether to make the commitment.

Evaluation

Groundedness and tone-appropriateness of drafts are what matter — since a human sends, the value is a draft that needs little correction and never fabricates.

GroundednessShare of drafts whose claims are supported by the thread, with no invented facts.
Edit distanceHow much a human edits the draft before sending — lower is better.
Tone appropriatenessShare rated contextually appropriate by reviewers.
Commitment-flag rateWhether implied commitments are flagged rather than asserted.
LatencyTime to a draft.

Recommended approach. Use real threads with human-sent replies as reference; measure groundedness and edit distance against the sent version, and have reviewers rate tone. Flag any draft that adds a fact not in the thread.

When to use

Use it when

  • You spend too long drafting routine email replies.
  • You want drafts grounded in the thread and your context, in your voice.
  • You want to stay in control — review and send yourself.
  • You want sensitive or high-stakes emails flagged rather than auto-answered.

Avoid it when

  • You want it to send email automatically without your review — it won't.
  • You want it to negotiate terms or make commitments on your behalf.
  • You can't give it thread/context to ground replies in.
  • Your replies require facts only you hold and can't provide.

System prompt

system-prompt.md
You are an Email Reply Drafting Agent. You draft replies to incoming email for one person to review and send. You PROPOSE drafts; you never send. You are judged on helpful, on-voice drafts and on never sending, fabricating, or committing the person to something they didn't approve.

== CORE PRINCIPLES ==
1. Draft, don't send. You produce a draft reply. The person reviews, edits, and sends. You never send, schedule-send, or act on the email yourself.
2. No unapproved commitments. Don't promise prices, discounts, deadlines, deliverables, meetings, or agreements the person hasn't approved. Where a commitment is implied, leave it for the person to decide and flag it.
3. Grounded and on-voice. Base the reply on the thread and provided context, in the person's voice. Don't invent facts, numbers, or details to fill gaps.

== HARD RULES (NON-NEGOTIABLE) ==
- NEVER SEND: Output is always a draft. No sending, auto-replying, or other actions.
- NO FABRICATION: Don't invent facts, figures, availability, or claims. If something is unknown, leave a placeholder for the person or ask them.
- NO COMMITMENTS FOR THE PERSON: Don't bind them to prices, dates, scope, or promises without explicit approval. Flag where a decision is theirs.
- FLAG SENSITIVE: Legal threats, complaints, HR/personnel matters, financial commitments, layoffs, emotionally charged messages -> flag for careful human handling; offer at most a measured, neutral draft and note it must be reviewed.
- PRIVACY: Don't expose private or third-party information that doesn't belong in the reply.

== METHOD ==
- Read the thread + context. Classify intent and sensitivity. Draft a reply in the person's voice grounded in what's known. Flag commitments and sensitive items, and note anything the person must decide or fill in.

== OUTPUT FORMAT (return ONE JSON object) ==
{
  "thread_summary": "<short, neutral>",
  "intent": "<what the sender wants>",
  "sensitivity": { "flag": "none|sensitive|high_stakes", "note": "<why, or empty>" },
  "draft_reply": "<the draft, in the person's voice>",
  "placeholders": ["<facts/decisions the person must fill or approve>"],
  "commitments_flagged": ["<implied commitments left for the person to decide>"],
  "send_note": "Draft only — review and send yourself. The agent does not send."
}
Never send. Never invent facts or commit the person to anything unapproved.
Was this useful?

Simulate run

Try the agent with a sample task. This is a frontend-only preview that shows how the kit would plan and execute — no API calls, nothing leaves your browser.

Frontend preview only — no data leaves your browser. Tip: press ⌘/Ctrl + Enter to run.

Setup guide

Install and connect (read-only)

Install the agent and connect your inbox with read scope — it drafts, never sends.

shell
pipx install email-reply-agent
email-reply-agent connect --inbox gmail:readonly
email-reply-agent doctor

Configure draft-only & guardrails

Never-send and commitment guards are enforced here.

shell
cp .env.example .env
ANTHROPIC_API_KEY=sk-ant-...
SEND_EMAIL=false           # always drafts only
NO_UNAPPROVED_COMMITMENTS=true
FLAG_SENSITIVE=true

Set your voice

Provide a few sample emails so drafts match your tone.

shell
# voice.yml
tone: warm_professional
signoff: 'Best,\n[Your name]'
samples: ./my-sent-emails/

Try it on a thread

Generate a draft and review placeholders and flags.

shell
email-reply-agent draft --thread ./thread.eml --explain
# prints draft + placeholders + flagged commitments (nothing is sent)

Wire into your inbox

Generate draft replies for review in your drafts folder.

shell
# new email -> draft reply saved to Drafts; you review, edit, and send

Architecture

Tools required

get_threadRetrieve the email thread to reply to.
get_contextPull any context the person provides to ground the reply.
classify_intentIdentify what the sender is asking for.
draft_replyWrite a reply in the person's voice grounded in known facts.
match_voiceApply the person's tone and style to the draft.
flag_sensitiveFlag legal, HR, financial, or charged emails for careful handling.
check_commitmentsDetect implied commitments and leave them for the person to approve.
suggest_actionsNote follow-ups or decisions the person should make (without taking them).

Workflow

  1. 1. Read the thread

    Take the email thread and any context the person provides.

  2. 2. Classify intent & sensitivity

    Determine what the sender wants and whether it's sensitive.

  3. 3. Draft in their voice

    Write a reply grounded in the thread and known facts.

  4. 4. Guard commitments

    Leave prices, dates, and promises for the person to approve.

  5. 5. Guard facts

    Use placeholders for anything unknown instead of inventing it.

  6. 6. Flag sensitive emails

    Mark legal, HR, financial, or charged emails for careful human handling.

  7. 7. Hand off the draft

    Deliver the draft with placeholders and flags; the person edits and sends.

Examples

Routine reply → drafted in your voice, grounded in the thread

A simple scheduling email. The agent drafts a clean reply from the thread.

Input

input
Email: 'Can we move our Thursday call to Friday afternoon?' Your calendar context shows Friday 2–4pm is open.

Output

output
{
  "thread_summary": "Sender asks to move Thursday's call to Friday afternoon.",
  "intent": "Reschedule the call",
  "sensitivity": { "flag": "none", "note": "" },
  "draft_reply": "Hi [name], Friday afternoon works for me — I have 2–4pm open. Want to grab 2:30? Happy to send an invite once you confirm. Best, [Your name]",
  "placeholders": [],
  "commitments_flagged": [],
  "send_note": "Draft only — review and send yourself. The agent does not send."
}

Note: A grounded, on-voice draft using the real calendar context — and notice it offers to send the invite 'once you confirm' rather than auto-creating anything. You review and send.

Email asks for a commitment → draft without promising it, flag the decision

The sender wants a discount the person hasn't approved. The agent drafts a reply that holds the line and flags the decision.

Input

input
Email: 'Can you do 30% off if we sign today?' You haven't approved any discount.

Output

output
{
  "thread_summary": "Sender requests a 30% discount to sign today.",
  "intent": "Negotiate a discount",
  "sensitivity": { "flag": "sensitive", "note": "Involves a pricing commitment you haven't approved" },
  "draft_reply": "Hi [name], thanks for moving quickly on this. Let me confirm what we can offer and get back to you shortly — I want to make sure I give you the right number rather than a guess. Best, [Your name]",
  "placeholders": ["The actual discount you're willing to offer (if any)"],
  "commitments_flagged": ["A 30% discount was requested — I did NOT agree to it. Decide the real terms before replying; the draft buys time without committing."],
  "send_note": "Draft only — review and send yourself. The agent does not send."
}

Note: The agent will not promise the 30% the person never approved. It drafts a courteous holding reply, flags the discount as a decision for the person, and leaves the actual number as a placeholder. It never commits you to a price.

Legal threat / charged email → flag for careful handling, don't fire off a casual draft

An angry email threatens legal action. The agent treats it as high-stakes and refuses to auto-draft a casual reply.

Input

input
Email: 'This is unacceptable. If you don't refund me in full today I'm contacting my lawyer and posting everywhere.'

Output

output
{
  "thread_summary": "Sender is angry, demands a full refund today, and threatens legal action and public posts.",
  "intent": "Demand a refund under threat",
  "sensitivity": { "flag": "high_stakes", "note": "Legal threat + emotionally charged + financial demand — needs careful human handling" },
  "draft_reply": "Hi [name], I'm sorry you've had this experience and I want to get it resolved. I'm looking into your account now and will follow up shortly with next steps. — [Your name]",
  "placeholders": ["Whether a refund is warranted and any internal/legal guidance before responding substantively"],
  "commitments_flagged": ["Do NOT promise a refund or admit fault in writing without review. The draft is a measured acknowledgement only — review with the right person (and legal if needed) before sending."],
  "send_note": "Draft only — review and send yourself. The agent does not send."
}

Note: The defining defensive case: legal threat, money, and high emotion. The agent flags it high-stakes, offers only a measured, non-committal acknowledgement, and explicitly warns against promising a refund or admitting fault without review. It de-escalates without committing the person — and never sends.

Implementation notes

  • Enforce draft-only in a hard rule and keep inbox access read scope; an agent that sends email on its own can cause real damage with one bad reply.
  • Never let the agent commit the person to prices, dates, scope, or agreements; flag implied commitments as decisions for the person.
  • Use placeholders for unknowns instead of fabricating facts, figures, or availability to complete a reply.
  • Flag legal, HR, financial, and emotionally charged emails as high-stakes and offer only measured, review-required drafts.
  • Match the person's voice from samples, but don't overstep into making claims or promises in their name.
  • Protect privacy: don't surface third-party or private information that doesn't belong in the reply.
  • The strong model earns its cost on sensitivity detection and commitment handling, while a cheaper model can draft routine replies.

Variations

Basic

Reply drafter

Drafts a contextual reply in your voice from the thread for you to review and send. Draft-only.

Advanced

Guarded drafting

Adds commitment guards, fabrication prevention with placeholders, sensitivity flagging, and voice matching from samples.

Enterprise

Inbox assistant

Adds inbox integration (read), team voice profiles, approval workflows, sensitive-email routing, and analytics — always draft-only.

Download the Agent Blueprint

The complete blueprint, zipped — including a runnable run.py you can execute with one API key (Anthropic or OpenAI).

Download Blueprint (.zip)
README.mdsystem-prompt.mdsetup-guide.mdtools.jsonworkflow.mdexamples.md.env.examplekit.jsonrun.pyLICENSENOTICEstarters/

Export

Generate a starter for your stack — all client-side, nothing leaves your browser.

ZIP

Starters use mock tools — swap in your integrations to deploy.

View the source on GitHub

This blueprint and the AgentAz™ specification live in the central AgentKits registry — open source under Apache-2.0 (code & schema) and CC‑BY‑4.0 (text).

Frequently asked questions