Claude can operate a real mailbox with three tool definitions and about forty lines of glue code.

tools = [
    {
        "name": "read_emails",
        "description": "List recent emails from the agent's inbox. Returns JSON.",
        "input_schema": {
            "type": "object",
            "properties": {
                "limit": {"type": "integer", "default": 10},
                "unread_only": {"type": "boolean", "default": False},
            },
        },
    },
    {
        "name": "search_emails",
        "description": "Search the agent's mailbox for messages matching a query.",
        "input_schema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
    {
        "name": "send_email",
        "description": "Send an email from the agent's own address.",
        "input_schema": {
            "type": "object",
            "properties": {
                "to": {"type": "string"},
                "subject": {"type": "string"},
                "body": {"type": "string"},
            },
            "required": ["to", "subject", "body"],
        },
    },
]

The interesting part isn't the schemas — it's what backs them. Instead of pointing these tools at a human's Gmail over OAuth, you can point them at a Nylas Agent Account: a hosted mailbox the agent owns outright, created with one command on a registered domain:
nylas agent account create agent@yourdomain.com
Agent Accounts are in beta, but they behave like any other grant, which means the same CLI commands and API endpoints work unchanged.
If you hand-roll Gmail OAuth, you're writing roughly 300 lines of token plumbing before the agent does anything useful. Add Microsoft Graph and you're at 600. Add IMAP fallback and you're past 1,000. The LLM agent with tools recipe takes a different route: shell out to the nylas CLI and let it handle auth, refresh, and provider differences. The implementations are short:

import json
import subprocess

def _run(cmd: list[str]) -> str:
    # Return stdout on success, or an error string the model can read.
    try:
        out = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    except subprocess.TimeoutExpired:
        # subprocess.run raises on timeout; surface it as text instead of crashing the loop.
        return "Error: command timed out after 30s"
    return out.stdout if out.returncode == 0 else f"Error: {out.stderr}"

def read_emails(limit: int = 10, unread_only: bool = False) -> str:
    cmd = ["nylas", "email", "list", "--limit", str(limit), "--json"]
    if unread_only:
        cmd.append("--unread")
    return _run(cmd)

def search_emails(query: str) -> str:
    return _run(["nylas", "email", "search", query, "--limit", "5", "--json"])

def send_email(to: str, subject: str, body: str) -> str:
    return _run(["nylas", "email", "send", "--to", to, "--subject", subject,
                 "--body", body, "--yes", "--json"])

Two flags matter more than they look. --yes skips the interactive "send this?" confirmation; without it, the send command blocks forever waiting for a keypress no agent will ever make. --json returns structured output the model can actually parse instead of human-formatted text.
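
To see the error-string convention without touching a mailbox, here is a minimal sketch that exercises the same pattern against the Python interpreter instead of the nylas CLI. run_cmd is a stand-in written for this example: it follows the same shape as _run above and returns an error string on a timeout as well as on a non-zero exit. The three child commands are made up purely to simulate success, failure, and a hang.

```python
import subprocess
import sys

def run_cmd(cmd: list[str], timeout: float = 30) -> str:
    """Return stdout on success, or an 'Error: ...' string the model can read."""
    try:
        out = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return f"Error: timed out after {timeout}s"
    return out.stdout if out.returncode == 0 else f"Error: {out.stderr}"

# Stand-ins for CLI calls: one succeeds, one exits non-zero, one hangs.
ok = run_cmd([sys.executable, "-c", "print('[]')"])
failed = run_cmd([sys.executable, "-c",
                  "import sys; sys.stderr.write('grant expired'); sys.exit(1)"])
hung = run_cmd([sys.executable, "-c", "import time; time.sleep(5)"], timeout=0.5)

for label, value in [("ok", ok), ("failed", failed), ("hung", hung)]:
    print(label, repr(value))
```

All three outcomes come back as plain strings, so the agent loop never has to handle an exception from the subprocess layer.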
Anthropic's tool-use flow is a loop: call the model, execute any tool_use blocks, feed results back, repeat until the model answers in plain text.

import anthropic

client = anthropic.Anthropic()
DISPATCH = {"read_emails": read_emails, "search_emails": search_emails,
            "send_email": send_email}
messages = [{"role": "user", "content": "Did anyone reply about the contract?"}]

while True:
    resp = client.messages.create(
        model="claude-sonnet-4-5", max_tokens=1024,
        tools=tools, messages=messages,
    )
    # Echo the assistant turn back so the next request carries the full history.
    messages.append({"role": "assistant", "content": resp.content})
    if resp.stop_reason != "tool_use":
        print(resp.content[0].text)
        break
    # Run every requested tool and return all results in a single user turn.
    results = [
        {"type": "tool_result", "tool_use_id": block.id,
         "content": DISPATCH[block.name](**block.input)}
        for block in resp.content if block.type == "tool_use"
    ]
    messages.append({"role": "user", "content": results})

Claude may issue several tool calls before producing a final answer — search first, read a specific message, then draft a reply. The loop shape doesn't care.
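
The loop above assumes every tool_use block names a known tool with valid arguments; DISPATCH[block.name](**block.input) raises otherwise. A hedged sketch of a more defensive step follows. execute_tool and the fake dispatch table are hypothetical names introduced for this example, while is_error is the field a tool_result block can carry to tell the model a call failed.

```python
from typing import Callable

def execute_tool(dispatch: dict[str, Callable[..., str]], tool_use_id: str,
                 name: str, tool_input: dict) -> dict:
    """Run one tool call and wrap the outcome as a tool_result block."""
    result = {"type": "tool_result", "tool_use_id": tool_use_id}
    fn = dispatch.get(name)
    if fn is None:
        return {**result, "content": f"Error: unknown tool {name!r}", "is_error": True}
    try:
        return {**result, "content": fn(**tool_input)}
    except TypeError as exc:  # typically missing or unexpected arguments
        return {**result, "content": f"Error: bad arguments: {exc}", "is_error": True}

# Demo with a fake tool instead of the real mailbox wrappers.
fake_dispatch = {"search_emails": lambda query: f'[{{"subject": "Re: {query}"}}]'}
good = execute_tool(fake_dispatch, "toolu_1", "search_emails", {"query": "contract"})
unknown = execute_tool(fake_dispatch, "toolu_2", "delete_everything", {})
bad_args = execute_tool(fake_dispatch, "toolu_3", "search_emails", {"q": "contract"})
print(good["content"])
print(unknown["content"])
```

In the loop, the list comprehension would call execute_tool(DISPATCH, block.id, block.name, block.input) for each tool_use block. Swapping `while True` for a bounded `for` loop is a similarly cheap way to stop a confused run.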
Give the loop a real task and the trace is more interesting than the code. For "Did anyone reply about the contract?", a typical run goes:

1. Claude emits a tool_use block: search_emails with {"query": "contract"}.
2. It follows up with read_emails with {"limit": 10} to pull recent messages with full context.
3. stop_reason comes back as end_turn and Claude answers in plain text: who replied, when, and what they said.

No step in that sequence was scripted. The model decided to search before reading, and decided two tool calls were enough. That's the whole appeal of tool use over a hardcoded pipeline, and also why the guardrails below matter.
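
To watch that sequence in your own runs, a small helper can turn each response into a one-line trace. This is a sketch: trace_line is a hypothetical helper, the sample blocks are fakes, and it assumes only the block attributes the loop already relies on (type, name, input, text).

```python
import json
from types import SimpleNamespace

def trace_line(stop_reason: str, content: list) -> str:
    """Summarise one model response: its tool calls, or the start of its text."""
    calls = [f"{b.name}({json.dumps(b.input)})" for b in content if b.type == "tool_use"]
    if calls:
        return f"{stop_reason}: " + ", ".join(calls)
    text = " ".join(b.text for b in content if b.type == "text")
    return f"{stop_reason}: {text[:60]}"

# Fake blocks standing in for resp.content from the Anthropic SDK.
turn1 = [SimpleNamespace(type="tool_use", name="search_emails",
                         input={"query": "contract"})]
turn3 = [SimpleNamespace(type="text", text="Two people replied.")]
print(trace_line("tool_use", turn1))
print(trace_line("end_turn", turn3))
```

Calling trace_line(resp.stop_reason, resp.content) right after each client.messages.create call gives a running log of what the model chose to do.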
read_emails and search_emails are harmless. send_email is not, so it deserves three layers of restraint:
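
As a hypothetical illustration of what one such layer might look like (an assumption for this sketch, not the cookbook's own list), the wrapper below puts a recipient-domain allowlist and a per-run send cap in front of a send function. guarded_sender, the allowlist contents, and the cap value are all invented for the example. Refusals come back as error strings so the model can see why a send was blocked.

```python
from typing import Callable

def guarded_sender(send: Callable[[str, str, str], str],
                   allowed_domains: set[str], max_sends: int = 3) -> Callable[..., str]:
    """Wrap a send function with a recipient allowlist and a per-run send cap."""
    sent = 0

    def guarded(to: str, subject: str, body: str) -> str:
        nonlocal sent
        domain = to.rsplit("@", 1)[-1].lower()
        if domain not in allowed_domains:
            return f"Error: recipient domain {domain!r} is not on the allowlist"
        if sent >= max_sends:
            return f"Error: send cap of {max_sends} reached for this run"
        sent += 1
        return send(to, subject, body)

    return guarded

# Demo with a fake sender; in the loop you would wrap the real send_email.
outbox: list[str] = []

def fake_send(to: str, subject: str, body: str) -> str:
    outbox.append(to)
    return '{"status": "sent"}'

safe_send = guarded_sender(fake_send, allowed_domains={"yourdomain.com"}, max_sends=1)
first = safe_send("accounting@yourdomain.com", "Invoice", "Summary attached.")
blocked = safe_send("stranger@example.org", "Hi", "Hello.")
capped = safe_send("accounting@yourdomain.com", "Invoice", "Again.")
print(first, blocked, capped, sep="\n")
```

Registering the wrapped function under "send_email" in DISPATCH keeps the tool schema unchanged while the restriction lives in your code rather than in the prompt.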

The timeout=30 on every subprocess.run call isn't decoration. A CLI command waiting on a prompt or a slow network would otherwise hang the loop forever. That is exactly the failure mode --yes exists to prevent, caught a second time.

Pass --api-key explicitly so one tenant's loop can never touch another tenant's mailbox.

nylas email list --limit 100 produces a wall of JSON that'll eat your context window. The cookbook's advice: cap limit aggressively in the schema itself. The default of 10 is deliberate, and 5 is a reasonable floor for list calls. Let error strings through too. Subprocess failures come back as stderr text, and the model is surprisingly good at deciding what to do with "grant expired" versus "rate limited."
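
A sketch of capping at both ends: the schema advertises bounds through JSON Schema's minimum and maximum keywords, and the wrapper clamps anyway in case the model ignores them. The bounds, the character budget, clamp_limit, and truncate_output are assumptions chosen for illustration, not values from the cookbook.

```python
LIMIT_MIN, LIMIT_MAX = 1, 10   # illustrative bounds for the limit parameter
MAX_RESULT_CHARS = 8_000       # arbitrary budget for a single tool result

# Drop-in replacement for the "limit" property in the read_emails schema.
limit_schema = {"type": "integer", "default": 10,
                "minimum": LIMIT_MIN, "maximum": LIMIT_MAX}

def clamp_limit(limit: int) -> int:
    """Force a model-supplied limit into the advertised range."""
    return max(LIMIT_MIN, min(LIMIT_MAX, int(limit)))

def truncate_output(text: str, max_chars: int = MAX_RESULT_CHARS) -> str:
    """Cut oversized CLI output and say so, so the model knows it is partial."""
    if len(text) <= max_chars:
        return text
    return text[:max_chars] + f"\n[truncated {len(text) - max_chars} characters]"

print(clamp_limit(100))
print(truncate_output("x" * 20, max_chars=8))
```

Truncation is a last resort because it can cut a JSON document in half. Lowering the limit keeps the output parseable, which is why the clamp comes first.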
One more operational note: the CLI acts on whichever grant is currently active in nylas auth list. An Agent Account shows up there with Provider: Nylas, so after creating one, switch to it before starting the loop; otherwise your agent cheerfully sends from your personal address.
Backing these tools with the agent's own address changes the safety story. Replies land in an inbox your application controls. There's no human whose sent folder fills with machine-written mail, and no OAuth consent that breaks when that human leaves the company. The mailbox sends, receives, and threads like any normal account.
There are three ways to wire Claude to this mailbox, and they suit different runtimes:
| Route | Best for | What it takes |
|---|---|---|
| Subprocess + CLI (this post) | Custom Python loops you fully control | Three wrapper functions, ~40 lines |
| MCP | Hosts that already speak MCP, like Claude Code | nylas mcp install --assistant claude-code (registers 16 email, calendar, and contacts tools, no wrappers) |
| SDK / raw API | Production services | pip install nylas, then call {base_url}/v3/grants/{grant_id}/{resource} with a Bearer API key |

The SDK route trades the CLI's convenience for explicitness: every call carries the grant_id, errors come back as structured JSON with an error.type field (unauthorized, rate_limit_error, invalid_request_error), and nothing depends on local CLI state. The autonomous agents quickstart covers the CLI and MCP routes, and the coding agents guide covers the SDK path if you'd rather call the API directly.
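
For the raw API route, here is a minimal sketch using only the standard library. The URL shape, the Bearer header, and the error.type field come from the text above. The base URL value, the messages resource with a limit parameter, the helper names, the placeholder key and grant ID, and the sample error body are assumptions for illustration. The request is built but never sent.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

BASE_URL = "https://api.us.nylas.com"  # assumed region; substitute your own base_url

def build_request(api_key: str, grant_id: str, resource: str,
                  **params) -> urllib.request.Request:
    """Build a GET for {base_url}/v3/grants/{grant_id}/{resource}."""
    url = f"{BASE_URL}/v3/grants/{urllib.parse.quote(grant_id)}/{resource}"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})

def error_type(body: str) -> Optional[str]:
    """Pull error.type out of a JSON error body, or None if it is absent."""
    try:
        return json.loads(body).get("error", {}).get("type")
    except (json.JSONDecodeError, AttributeError):
        return None

req = build_request("nyk_example", "grant-123", "messages", limit=5)
print(req.full_url)
# Hypothetical body shaped like the structured errors described above.
print(error_type('{"error": {"type": "rate_limit_error", "message": "slow down"}}'))
```

Passing req to urllib.request.urlopen would perform the call. Branching on error_type is what lets a service retry on rate_limit_error but stop on unauthorized, and because the grant ID is an explicit argument, nothing depends on which grant a local CLI considers active.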
Try giving the loop a task that requires multiple turns — "find the latest invoice email and forward a summary to accounting" — and watch which tools Claude chains together. What's the first tool you'd add beyond these three?