Email Tools for Claude: Tool Use With an Agent Mailbox


Claude can operate a real mailbox with three tool definitions and about forty lines of glue code.

tools = [
    {
        "name": "read_emails",
        "description": "List recent emails from the agent's inbox. Returns JSON.",
        "input_schema": {
            "type": "object",
            "properties": {
                "limit": {"type": "integer", "default": 10},
                "unread_only": {"type": "boolean", "default": False},
            },
        },
    },
    {
        "name": "search_emails",
        "description": "Search the agent's mailbox for messages matching a query.",
        "input_schema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
    {
        "name": "send_email",
        "description": "Send an email from the agent's own address.",
        "input_schema": {
            "type": "object",
            "properties": {
                "to": {"type": "string"},
                "subject": {"type": "string"},
                "body": {"type": "string"},
            },
            "required": ["to", "subject", "body"],
        },
    },
]

The interesting part isn't the schemas — it's what backs them. Instead of pointing these tools at a human's Gmail over OAuth, you can point them at a Nylas Agent Account: a hosted mailbox the agent owns outright, created with one command on a registered domain:

nylas agent account create agent@yourdomain.com

Agent Accounts are in beta, but they behave like any other grant, which means the same CLI commands and API endpoints work unchanged.

If you hand-roll Gmail OAuth, you're writing roughly 300 lines of token plumbing before the agent does anything useful. Add Microsoft Graph and you're at 600. Add IMAP fallback and you're past 1,000. The LLM agent with tools recipe takes a different route: shell out to the `nylas` CLI and let it handle auth, refresh, and provider differences. The implementations are short:

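import json, subprocess

def _run(cmd: list[str]) -> str:
    # Return stdout on success, or an "Error: ..." string the model can read.
    try:
        out = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    except subprocess.TimeoutExpired:
        return "Error: command timed out after 30 seconds"
    return out.stdout if out.returncode == 0 else f"Error: {out.stderr}"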

def read_emails(limit: int = 10, unread_only: bool = False) -> str:
    cmd = ["nylas", "email", "list", "--limit", str(limit), "--json"]
    if unread_only:
        cmd.append("--unread")
    return _run(cmd)

def search_emails(query: str) -> str:
    return _run(["nylas", "email", "search", query, "--limit", "5", "--json"])

def send_email(to: str, subject: str, body: str) -> str:
    return _run(["nylas", "email", "send", "--to", to, "--subject", subject,
                 "--body", body, "--yes", "--json"])

Two flags matter more than they look. `--yes` skips the interactive "send this?" confirmation; without it, the send command blocks waiting for a keypress no agent will ever make. `--json` returns structured output the model can actually parse instead of human-formatted text.

Anthropic's tool-use flow is a loop: call the model, execute any `tool_use` blocks, feed results back, repeat until the model answers in plain text.

import anthropic

client = anthropic.Anthropic()
DISPATCH = {"read_emails": read_emails, "search_emails": search_emails,
            "send_email": send_email}

messages = [{"role": "user", "content": "Did anyone reply about the contract?"}]
while True:
    resp = client.messages.create(
        model="claude-sonnet-4-5", max_tokens=1024,
        tools=tools, messages=messages,
    )
    messages.append({"role": "assistant", "content": resp.content})
    if resp.stop_reason != "tool_use":
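        # Print every text block; the first block is not guaranteed to be text.
        print("".join(b.text for b in resp.content if b.type == "text"))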
        break
    results = [
        {"type": "tool_result", "tool_use_id": block.id,
         "content": DISPATCH[block.name](**block.input)}
        for block in resp.content if block.type == "tool_use"
    ]
    messages.append({"role": "user", "content": results})

Claude may issue several tool calls before producing a final answer — search first, read a specific message, then draft a reply. The loop shape doesn't care.
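The list comprehension in the loop assumes every tool name exists in `DISPATCH` and every argument set is valid. A slightly more defensive variant is sketched below. The helper name `run_tool` is hypothetical, not part of the original recipe. It relies on the `is_error` field that the Messages API accepts on `tool_result` blocks, and on the `"Error:"` prefix that `_run` already puts on failures.

```python
from typing import Any, Callable

def run_tool(dispatch: dict[str, Callable[..., str]], name: str,
             tool_input: dict[str, Any], tool_use_id: str) -> dict[str, Any]:
    """Execute one tool call and wrap the outcome as a tool_result block.

    Hypothetical helper: unknown tools and bad arguments become error
    results the model can react to, instead of exceptions that kill the loop.
    """
    fn = dispatch.get(name)
    if fn is None:
        content, is_error = f"Error: unknown tool {name!r}", True
    else:
        try:
            content = fn(**tool_input)
            # The wrappers in this post signal failure with an "Error:" prefix.
            is_error = content.startswith("Error:")
        except TypeError as exc:
            # Raised when the model supplies missing or unexpected arguments.
            content, is_error = f"Error: bad arguments for {name}: {exc}", True

    result: dict[str, Any] = {
        "type": "tool_result",
        "tool_use_id": tool_use_id,
        "content": content,
    }
    if is_error:
        result["is_error"] = True
    return result
```

Inside the loop, the comprehension body would become `run_tool(DISPATCH, block.name, block.input, block.id)`. The loop shape stays the same.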

Give the loop a real task and the trace is more interesting than the code. For "Did anyone reply about the contract?", a typical run goes:

1. A `tool_use` block: `search_emails` with `{"query": "contract"}`.
2. `read_emails` with `{"limit": 10}` to pull recent messages with full context.
3. `stop_reason` comes back as `end_turn` and Claude answers in plain text: who replied, when, and what they said.

No step in that sequence was scripted. The model decided to search before reading, and decided two tool calls were enough. That's the whole appeal of tool use over a hardcoded pipeline, and also why the guardrails below matter.

`read_emails` and `search_emails` are harmless. `send_email` is not, so it deserves three layers of restraint:

- `timeout=30` on every `subprocess.run` call isn't decoration. A CLI command waiting on a prompt or a slow network would otherwise hang the loop forever. That is exactly the failure mode `--yes` exists to prevent, caught a second time.
- Pass `--api-key` explicitly so one tenant's loop can never touch another tenant's mailbox.
- `nylas email list --limit 100` produces a wall of JSON that'll eat your context window. The cookbook's advice: cap `limit` aggressively in the schema itself. The default of 10 is deliberate, and 5 is a reasonable floor for list calls.

Let error strings through too. Subprocess failures come back as stderr text, and the model is surprisingly good at deciding what to do with "grant expired" versus "rate limited."
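The advice to cap `limit` can be applied in two places: in the schema, using the standard JSON Schema `minimum` and `maximum` keywords, and again in the wrapper, since nothing guarantees the model stays inside the schema. The sketch below is one way to do that. `MAX_LIMIT`, `LIMIT_SCHEMA`, `clamp_limit`, and `build_list_cmd` are hypothetical names, and the ceiling of 10 simply reuses the default from the schema above.

```python
MAX_LIMIT = 10  # assumption: reuse the schema's default of 10 as the hard ceiling

# Drop-in replacement for the "limit" property in the read_emails schema.
LIMIT_SCHEMA = {"type": "integer", "default": 10, "minimum": 1, "maximum": MAX_LIMIT}

def clamp_limit(limit: int, lo: int = 1, hi: int = MAX_LIMIT) -> int:
    """Force limit into [lo, hi] no matter what the model asked for."""
    return max(lo, min(int(limit), hi))

def build_list_cmd(limit: int = 10, unread_only: bool = False) -> list[str]:
    """Build the same CLI command as read_emails, with the limit clamped."""
    cmd = ["nylas", "email", "list", "--limit", str(clamp_limit(limit)), "--json"]
    if unread_only:
        cmd.append("--unread")
    return cmd
```

With this in place, a request for 100 messages still runs, but only ever asks the CLI for 10.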

One more operational note: the CLI acts on whichever grant is currently active in `nylas auth list`. An Agent Account shows up there with `Provider: Nylas`, so after creating one, switch to it before starting the loop; otherwise your agent cheerfully sends from your personal address.

Backing these tools with the agent's own address changes the safety story. Replies land in an inbox your application controls. There's no human whose sent folder fills with machine-written mail, and no OAuth consent that breaks when that human leaves the company. The mailbox sends, receives, and threads like any normal account.

There are three ways to wire Claude to this mailbox, and they suit different runtimes:

Route	Best for	What it takes
Subprocess + CLI (this post)	Custom Python loops you fully control	Three wrapper functions, ~40 lines
MCP	Hosts that already speak MCP, like Claude Code	`nylas mcp install --assistant claude-code` (registers 16 email, calendar, and contacts tools, no wrappers)
SDK / raw API	Production services	`pip install nylas`, then call `{base_url}/v3/grants/{grant_id}/{resource}` with a Bearer API key

The SDK route trades the CLI's convenience for explicitness: every call carries the `grant_id`, errors come back as structured JSON with an `error.type` field (`unauthorized`, `rate_limit_error`, `invalid_request_error`), and nothing depends on local CLI state. The autonomous agents quickstart covers the CLI and MCP routes, and the coding agents guide covers the SDK path if you'd rather call the API directly.
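To make the raw API route concrete, here is a minimal sketch using only the standard library. It builds a request for the `{base_url}/v3/grants/{grant_id}/{resource}` pattern and maps the three `error.type` values named above to a next step. Several details are assumptions: the US base URL, the helper names `build_request` and `next_action`, the action labels, and the error body shape `{"error": {"type": ...}}`, which is inferred from the `error.type` field name. The sketch builds the request but does not send it.

```python
import json
import urllib.request

BASE_URL = "https://api.us.nylas.com"  # assumption: swap in your region's base URL

def build_request(api_key: str, grant_id: str, resource: str,
                  base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a GET request scoped to one grant."""
    url = f"{base_url}/v3/grants/{grant_id}/{resource}"
    headers = {"Authorization": f"Bearer {api_key}", "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers)

# Hypothetical mapping from error.type to what the caller should do next.
ACTIONS = {
    "unauthorized": "reauthenticate",
    "rate_limit_error": "retry_later",
    "invalid_request_error": "fix_request",
}

def next_action(error_body: str) -> str:
    """Read error.type from a JSON error body and pick a next step."""
    try:
        error_type = json.loads(error_body)["error"]["type"]
    except (ValueError, KeyError, TypeError):
        # Not JSON, or not the assumed {"error": {"type": ...}} shape.
        return "unknown"
    return ACTIONS.get(error_type, "unknown")
```

Passing the result of `build_request("KEY", "GRANT_ID", "messages")` to `urllib.request.urlopen` would perform the call. Because the grant is part of the URL, no request depends on which grant the CLI considers active.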

Try giving the loop a task that requires multiple turns — "find the latest invoice email and forward a summary to accounting" — and watch which tools Claude chains together. What's the first tool you'd add beyond these three?

Source: dev.to (original article)
