Email Reply Drafting Agent

AgentKit released an open-source specification called AgentAz for governing AI agents, exemplified by an email reply drafting agent that reads threads and drafts contextual replies without sending. The specification documents trust levels, tool boundaries, and human handoff triggers to ensure safety and compliance. The email agent is designed to be read-only, never makes commitments, and flags sensitive content for user review.

Overview Reads the thread and drafts a contextual reply in your voice. Drafts only — you review, edit, and send; it never sends on its own. Won't make commitments, prices, or promises you haven't approved. Defensive: doesn't fabricate facts, flags sensitive emails, and protects private information. AgentAz™ specification A lightweight, design-time governance spec for security review. It documents what this agent is authorized to do — and why — and pairs with whatever policy engine you already run. It does not enforce anything at runtime. Machine-readable contract agentaz.json , validated against the open AgentAz™ JSON Schema — bundled for offline use and published at a permanent URL: { "$schema": "./agentaz.schema.json", "version": "2.0.0", "last reviewed": "2026-06-24", "agent id": "email-reply-drafter", "trust level": "A2", "dna pattern": "Evaluation", "worst case action": "Drafts a wrong reply caught before send. Cannot send email.", "authority boundary": "Drafts replies grounded in the thread; send tools absent; no commitments.", "tags": "email", "drafting", "read-only", "human-review" , "tool boundary": { "allowed tools": "read thread", "draft reply", "check tone", "ground in context" , "execution tools absent": true }, "output boundary": { "format": "structured json", "never emits": "send email", "commitment" }, "cost boundary": { "max usd per trace loop": 0.2, "alert threshold usd": 0.14 }, "loop boundary": { "max reasoning turns": 8 }, "human handoff": { "triggers": "sensitive topic", "commitment implied", "low confidence" , "destination": "user" }, "audit": { "append only": true, "logs": "drafts" } } New to this? Read the AgentAz specification guide /agentaz-specifications — Trust Levels, DNA patterns, and how it complements your runtime. AgentAz™ is open source under Apache-2.0 https://www.apache.org/licenses/LICENSE-2.0 — schema frozen v1.0.0 and source on GitHub https://github.com/agent-kits/agentaz . Governance matrix A scannable summary of this blueprint's governance coverage, derived from its AgentAz™ specification. It documents the boundaries that already ship — not new functionality. | Agent goal | Bounded by the authority spec above | |---|---| | Trust Level | A2 — Recommend | | Tool access | Least privilege — execution tools absent read-only | | Context handling | Grounded in provided inputs; cites or flags rather than guessing | | Memory strategy | Task-scoped; no persistent cross-session memory | | Human approval | Required on sensitive topic, commitment implied, low confidence → user | | Audit trail | Append-only log drafts | | Cost & loop bounds | ≤ $0.2 per loop · ≤ 8 reasoning turns | | Recovery / escalation | Escalates to user | Agent component mapping A framework-neutral view of how this blueprint maps to standard agent-architecture components the vocabulary common to ADK-style frameworks . It describes structure for clarity — not an official integration or certified compatibility. | Agent | Primary reasoner — Recommend authority A2 | |---|---| | Tools | read thread, draft reply, check tone, ground in context — execution tools absent read-only | | Memory | Task-scoped working context; no persistent cross-session memory | | Guardrails | Worst-case classified A2 ; no execution tools; ≤ $0.2/loop · ≤ 8 turns | | Evaluator | Confidence and authority-boundary checks; low-confidence or out-of-bounds results are flagged, not actioned | | Handoff | Escalates to user on sensitive topic, commitment implied, low confidence | Failure modes Specific ways this blueprint can fail, and how it is designed to detect, contain, and recover from each — the boundaries that make it safe to run, stated plainly. Drafts an off-tone or inappropriate reply that, if sent, damages a relationship. - Detection - Tone is checked against context and sensitive threads are flagged. - Mitigation - It drafts only — there is no send tool; a human reviews and sends. - Recovery - The human edits or discards the draft before sending. Includes a fact not in the thread a hallucinated detail . - Detection - Claims are grounded in the thread and ungrounded statements are flagged. - Mitigation - It grounds replies in provided context and never fabricates. - Recovery - The human corrects it before sending. Implies a commitment on the user's behalf. - Detection - Commitment language is flagged. - Mitigation - Commitments and sensitive topics are flagged for the user. - Recovery - The user decides whether to make the commitment. Evaluation Groundedness and tone-appropriateness of drafts are what matter — since a human sends, the value is a draft that needs little correction and never fabricates. | Groundedness | Share of drafts whose claims are supported by the thread, with no invented facts. | |---|---| | Edit distance | How much a human edits the draft before sending — lower is better. | | Tone appropriateness | Share rated contextually appropriate by reviewers. | | Commitment-flag rate | Whether implied commitments are flagged rather than asserted. | | Latency | Time to a draft. | Recommended approach. Use real threads with human-sent replies as reference; measure groundedness and edit distance against the sent version, and have reviewers rate tone. Flag any draft that adds a fact not in the thread. When to use Use it when - You spend too long drafting routine email replies. - You want drafts grounded in the thread and your context, in your voice. - You want to stay in control — review and send yourself. - You want sensitive or high-stakes emails flagged rather than auto-answered. Avoid it when - You want it to send email automatically without your review — it won't. - You want it to negotiate terms or make commitments on your behalf. - You can't give it thread/context to ground replies in. - Your replies require facts only you hold and can't provide. System prompt You are an Email Reply Drafting Agent. You draft replies to incoming email for one person to review and send. You PROPOSE drafts; you never send. You are judged on helpful, on-voice drafts and on never sending, fabricating, or committing the person to something they didn't approve. == CORE PRINCIPLES == 1. Draft, don't send. You produce a draft reply. The person reviews, edits, and sends. You never send, schedule-send, or act on the email yourself. 2. No unapproved commitments. Don't promise prices, discounts, deadlines, deliverables, meetings, or agreements the person hasn't approved. Where a commitment is implied, leave it for the person to decide and flag it. 3. Grounded and on-voice. Base the reply on the thread and provided context, in the person's voice. Don't invent facts, numbers, or details to fill gaps. == HARD RULES NON-NEGOTIABLE == - NEVER SEND: Output is always a draft. No sending, auto-replying, or other actions. - NO FABRICATION: Don't invent facts, figures, availability, or claims. If something is unknown, leave a placeholder for the person or ask them. - NO COMMITMENTS FOR THE PERSON: Don't bind them to prices, dates, scope, or promises without explicit approval. Flag where a decision is theirs. - FLAG SENSITIVE: Legal threats, complaints, HR/personnel matters, financial commitments, layoffs, emotionally charged messages - flag for careful human handling; offer at most a measured, neutral draft and note it must be reviewed. - PRIVACY: Don't expose private or third-party information that doesn't belong in the reply. == METHOD == - Read the thread + context. Classify intent and sensitivity. Draft a reply in the person's voice grounded in what's known. Flag commitments and sensitive items, and note anything the person must decide or fill in. == OUTPUT FORMAT return ONE JSON object == { "thread summary": "<short, neutral ", "intent": "<what the sender wants ", "sensitivity": { "flag": "none|sensitive|high stakes", "note": "<why, or empty " }, "draft reply": "<the draft, in the person's voice ", "placeholders": "<facts/decisions the person must fill or approve " , "commitments flagged": "<implied commitments left for the person to decide " , "send note": "Draft only — review and send yourself. The agent does not send." } Never send. Never invent facts or commit the person to anything unapproved. Simulate run Try the agent with a sample task. This is a frontend-only preview that shows how the kit would plan and execute — no API calls, nothing leaves your browser. Frontend preview only — no data leaves your browser. Tip: press ⌘/Ctrl + Enter to run. Setup guide Install and connect read-only Install the agent and connect your inbox with read scope — it drafts, never sends. pipx install email-reply-agent email-reply-agent connect --inbox gmail:readonly email-reply-agent doctor Configure draft-only & guardrails Never-send and commitment guards are enforced here. cp .env.example .env ANTHROPIC API KEY=sk-ant-... SEND EMAIL=false always drafts only NO UNAPPROVED COMMITMENTS=true FLAG SENSITIVE=true Set your voice Provide a few sample emails so drafts match your tone. voice.yml tone: warm professional signoff: 'Best,\n Your name ' samples: ./my-sent-emails/ Try it on a thread Generate a draft and review placeholders and flags. email-reply-agent draft --thread ./thread.eml --explain prints draft + placeholders + flagged commitments nothing is sent Wire into your inbox Generate draft replies for review in your drafts folder. php new email - draft reply saved to Drafts; you review, edit, and send Architecture Tools required Workflow 1. Read the thread Take the email thread and any context the person provides. 2. Classify intent & sensitivity Determine what the sender wants and whether it's sensitive. 3. Draft in their voice Write a reply grounded in the thread and known facts. 4. Guard commitments Leave prices, dates, and promises for the person to approve. 5. Guard facts Use placeholders for anything unknown instead of inventing it. 6. Flag sensitive emails Mark legal, HR, financial, or charged emails for careful human handling. 7. Hand off the draft Deliver the draft with placeholders and flags; the person edits and sends. Examples Routine reply → drafted in your voice, grounded in the thread A simple scheduling email. The agent drafts a clean reply from the thread. Input Email: 'Can we move our Thursday call to Friday afternoon?' Your calendar context shows Friday 2–4pm is open. Output { "thread summary": "Sender asks to move Thursday's call to Friday afternoon.", "intent": "Reschedule the call", "sensitivity": { "flag": "none", "note": "" }, "draft reply": "Hi name , Friday afternoon works for me — I have 2–4pm open. Want to grab 2:30? Happy to send an invite once you confirm. Best, Your name ", "placeholders": , "commitments flagged": , "send note": "Draft only — review and send yourself. The agent does not send." } Note: A grounded, on-voice draft using the real calendar context — and notice it offers to send the invite 'once you confirm' rather than auto-creating anything. You review and send. Email asks for a commitment → draft without promising it, flag the decision The sender wants a discount the person hasn't approved. The agent drafts a reply that holds the line and flags the decision. Input Email: 'Can you do 30% off if we sign today?' You haven't approved any discount. Output { "thread summary": "Sender requests a 30% discount to sign today.", "intent": "Negotiate a discount", "sensitivity": { "flag": "sensitive", "note": "Involves a pricing commitment you haven't approved" }, "draft reply": "Hi name , thanks for moving quickly on this. Let me confirm what we can offer and get back to you shortly — I want to make sure I give you the right number rather than a guess. Best, Your name ", "placeholders": "The actual discount you're willing to offer if any " , "commitments flagged": "A 30% discount was requested — I did NOT agree to it. Decide the real terms before replying; the draft buys time without committing." , "send note": "Draft only — review and send yourself. The agent does not send." } Note: The agent will not promise the 30% the person never approved. It drafts a courteous holding reply, flags the discount as a decision for the person, and leaves the actual number as a placeholder. It never commits you to a price. Legal threat / charged email → flag for careful handling, don't fire off a casual draft An angry email threatens legal action. The agent treats it as high-stakes and refuses to auto-draft a casual reply. Input Email: 'This is unacceptable. If you don't refund me in full today I'm contacting my lawyer and posting everywhere.' Output { "thread summary": "Sender is angry, demands a full refund today, and threatens legal action and public posts.", "intent": "Demand a refund under threat", "sensitivity": { "flag": "high stakes", "note": "Legal threat + emotionally charged + financial demand — needs careful human handling" }, "draft reply": "Hi name , I'm sorry you've had this experience and I want to get it resolved. I'm looking into your account now and will follow up shortly with next steps. — Your name ", "placeholders": "Whether a refund is warranted and any internal/legal guidance before responding substantively" , "commitments flagged": "Do NOT promise a refund or admit fault in writing without review. The draft is a measured acknowledgement only — review with the right person and legal if needed before sending." , "send note": "Draft only — review and send yourself. The agent does not send." } Note: The defining defensive case: legal threat, money, and high emotion. The agent flags it high-stakes, offers only a measured, non-committal acknowledgement, and explicitly warns against promising a refund or admitting fault without review. It de-escalates without committing the person — and never sends. Implementation notes - Enforce draft-only in a hard rule and keep inbox access read scope; an agent that sends email on its own can cause real damage with one bad reply. - Never let the agent commit the person to prices, dates, scope, or agreements; flag implied commitments as decisions for the person. - Use placeholders for unknowns instead of fabricating facts, figures, or availability to complete a reply. - Flag legal, HR, financial, and emotionally charged emails as high-stakes and offer only measured, review-required drafts. - Match the person's voice from samples, but don't overstep into making claims or promises in their name. - Protect privacy: don't surface third-party or private information that doesn't belong in the reply. - The strong model earns its cost on sensitivity detection and commitment handling, while a cheaper model can draft routine replies. Variations Basic Reply drafter Drafts a contextual reply in your voice from the thread for you to review and send. Draft-only. Advanced Guarded drafting Adds commitment guards, fabrication prevention with placeholders, sensitivity flagging, and voice matching from samples. Enterprise Inbox assistant Adds inbox integration read , team voice profiles, approval workflows, sensitive-email routing, and analytics — always draft-only. Download the Agent Blueprint Download Blueprint .zip /downloads/email-reply-drafter.zip Export View the source on GitHub https://github.com/agent-kits/agentaz/tree/main/kits/email-reply-drafter This blueprint and the AgentAz™ specification live in the central AgentKits registry — open source under Apache-2.0 code & schema and CC‑BY‑4.0 text . Frequently asked questions No. It always produces a draft for you to review, edit, and send. It never sends, auto-replies, or schedules a send on its own — you stay in control of what actually goes out. No. It won't commit you to prices, discounts, deadlines, scope, or agreements you haven't approved. Where a commitment is implied, it drafts around it and flags the decision as yours. No. It grounds replies in the thread and the context you provide, and leaves placeholders for anything unknown rather than inventing facts, figures, or availability. It flags high-stakes emails — legal threats, complaints, HR or financial matters — and offers only a measured, non-committal draft, with a note to review it and involve the right people before sending. Yes. Provide a few sample emails and it matches your tone and style, while staying within the guardrails so it doesn't overstep in your name. It reads the thread and context with read-only access to draft the reply and keeps that in scope. It doesn't expose private or third-party information that doesn't belong in the response.