8:30 AM: you kick off the inbox-zero loop while the coffee brews. 8:35 AM: yesterday's 50 unread messages are sorted into four buckets, replies are drafted for the ones that matter, the noise is archived, and you've approved the lot with a few keystrokes. The rest of the day, the inbox only contains new mail.
That's the daily rhythm an email agent can sustain, and the daily part is the whole trick. Inbox zero has never been hard to reach once; it's hard to keep, because the maintenance is boring. Boring, repetitive classification is exactly what agents are for. And it gets more interesting when the inbox in question belongs to the robot itself.
Pull a manageable batch of unread mail. Fifty is the sweet spot; below 20 you waste setup overhead, and above 50 you blow the LLM context budget and the approval review drags:
nylas email list --unread --limit 50 --json
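If you drive the fetch from Python rather than the shell, a thin wrapper around the same command is enough. This is a minimal sketch: `fetch_unread` and `parse_batch` are hypothetical helper names, and it assumes the `--json` output is a JSON array of message objects, which you should confirm against your CLI version.

```python
import json
import subprocess

BATCH_LIMIT = 50  # the batch size recommended above


def parse_batch(raw: str, limit: int = BATCH_LIMIT) -> list[dict]:
    """Parse CLI output, assuming it is a JSON array of message objects."""
    messages = json.loads(raw)
    if not isinstance(messages, list):
        raise ValueError("expected a JSON array of messages")
    return messages[:limit]  # never hand the model more than one batch


def fetch_unread(limit: int = BATCH_LIMIT) -> list[dict]:
    """Run the command shown above and return the parsed batch."""
    result = subprocess.run(
        ["nylas", "email", "list", "--unread", "--limit", str(limit), "--json"],
        capture_output=True,
        text=True,
        check=True,  # raise if the CLI exits with an error
    )
    return parse_batch(result.stdout, limit)
```

Keeping the parsing separate from the subprocess call means you can test the batch cap without touching a real mailbox.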
Classify each message into one of four buckets:
| Bucket | Reply window |
|---|---|
| Urgent | Within hours: client issue, manager request |
| Action required | Today: meeting follow-up, review |
| FYI | No response: newsletter, status update |
| Archive | Now: marketing, automated alerts |
The agent shows a summary table; you audit it before anything else happens. A misfiled "Action" caught at this step becomes a correct draft instead of a missed commitment. Then the agent drafts replies for Urgent and Action items, you approve, edit, or skip each one, and approved drafts ship while Archive items get archived.
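The summary step can be as small as a count per bucket. The sketch below is one possible shape, not the agent's actual output: the `Bucket` enum and `summary_rows` are hypothetical names, and the reply windows are copied from the table above.

```python
from collections import Counter
from enum import Enum


class Bucket(Enum):
    # Value is the reply window from the bucket table.
    URGENT = "Within hours"
    ACTION = "Today"
    FYI = "No response"
    ARCHIVE = "Now"


def summary_rows(classified: dict[str, Bucket]) -> list[tuple[str, str, int]]:
    """Return (bucket, reply window, count) rows in table order."""
    counts = Counter(classified.values())
    return [(b.name, b.value, counts[b]) for b in Bucket]


def print_summary_table(classified: dict[str, Bucket]) -> None:
    """Print one aligned row per bucket for the human audit step."""
    for name, window, count in summary_rows(classified):
        print(f"{name:<8} {window:<13} {count:>3}")
```

Empty buckets still get a row, so a zero in Urgent is visible rather than silently missing.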
One rule is non-negotiable: the agent never sends without explicit approval. The agent drafts; the human ships. Even when a draft is obviously fine, the click is the difference between "AI wrote this" and "I wrote this with help."
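One way to make that rule structural rather than a matter of discipline is to route every send through a gate that fails closed. This is a sketch under assumptions: `Draft`, `ship_approved`, and the `"y"` / `"edit:<text>"` decision strings are illustrative, and treating an edit as approval of the edited text is a design choice you may not share.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Draft:
    to: str
    body: str


def ship_approved(
    drafts: list[Draft],
    decide: Callable[[Draft], str],  # returns "y", "edit:<new body>", or anything else
    send: Callable[[Draft], None],
) -> list[Draft]:
    """Send only drafts a human approved; return the skipped ones."""
    skipped = []
    for draft in drafts:
        decision = decide(draft)
        if decision.startswith("edit:"):
            draft.body = decision[len("edit:"):]
            send(draft)  # assumption: an edit approves the edited text
        elif decision == "y":
            send(draft)
        else:
            skipped.append(draft)  # fail closed: unknown input never sends
    return skipped
```

Because `send` is only reachable through an explicit `"y"` or an edit, a typo or an empty response leaves the draft unsent.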
First runs will misclassify. Encode the corrections as standing rules in the agent's prompt, and each pass gets faster:
always_fyi:
- "from: sales@*"
- "from: noreply@*"
- "subject: ^\\[GitHub\\]"
always_urgent:
- "from: *@board.example.com"
- "subject: \\b(p0|incident|outage)\\b"
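If you want these rules enforced in code before the model is consulted, a small matcher is enough. The sketch below makes several assumptions: `from:` patterns are treated as case-insensitive globs and `subject:` patterns as case-insensitive regular expressions, `always_urgent` is checked before `always_fyi`, and `preclassify` is a hypothetical name. The rules are restated as a Python dict so the example has no YAML dependency.

```python
import re
from fnmatch import fnmatch
from typing import Optional

RULES = {
    "always_fyi": ["from: sales@*", "from: noreply@*", r"subject: ^\[GitHub\]"],
    "always_urgent": ["from: *@board.example.com", r"subject: \b(p0|incident|outage)\b"],
}


def rule_matches(rule: str, sender: str, subject: str) -> bool:
    """Match one 'field: pattern' rule against a message's sender and subject."""
    field, _, pattern = rule.partition(": ")
    if field == "from":
        return fnmatch(sender.lower(), pattern.lower())  # glob on the address
    if field == "subject":
        return re.search(pattern, subject, re.IGNORECASE) is not None
    return False


def preclassify(sender: str, subject: str) -> Optional[str]:
    """Return a bucket if a standing rule applies, else None (ask the model)."""
    # Urgent is checked first so an incident sent from noreply@ is not filed as FYI.
    for key, bucket in (("always_urgent", "URGENT"), ("always_fyi", "FYI")):
        if any(rule_matches(r, sender, subject) for r in RULES[key]):
            return bucket
    return None
```

A `None` result is the signal to fall through to the model, so adding a rule only ever removes work from the LLM.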
If you'd rather drive it as a script than a chat session, the whole loop fits in a dozen lines of orchestration:
unread = fetch_unread(limit=50)
buckets = classify_all(unread)  # 4-bucket categorization
print_summary_table(buckets)

for msg in input_corrections(buckets):  # interactive correction
    pass

drafts = [draft_reply(m) for m in buckets["URGENT"] + buckets["ACTION"]]
for draft in interactive_approval(drafts):  # one-by-one Y/N/edit
    if draft.approved:
        send(draft)

for msg in buckets["ARCHIVE"]:
    archive(msg)
The interactive correction and approval steps are what separate this from a cron-driven triage bot: same bucket model, different trust contract. Some teams eventually add a fifth "delegate" bucket that auto-CCs a teammate; do that after the four-bucket version is bedded in, not before.
Run this loop against a human's inbox and you've built an assistant. Run it against an inbox the agent owns and you've built something more autonomous: a support@ or triage@ address that is the agent's workspace, not borrowed territory.
That's what Agent Accounts provide: hosted mailboxes (currently in beta) created by API call, each with a real address and a `grant_id` that works with the standard Messages, Threads, Folders, and Drafts endpoints. Every mailbox arrives with six system folders (`inbox`, `sent`, `drafts`, `trash`, `junk`, `archive`), and you can create custom folders alongside them for whatever taxonomy your buckets need. (System folder names are reserved.)
Folder hygiene stops being a human courtesy and becomes agent state: `inbox` is the work queue, `archive` is processed-no-action, a custom `escalations` folder is the handoff point to humans. The mailbox structure is the state machine.
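To make the state-machine idea concrete, here is a minimal sketch of folders as states with an explicit table of allowed moves. The system folder names and the `escalations` folder come from the text; the transition table, the `next_folder` helper, and the choice to archive FYI mail are assumptions for illustration, not behavior the platform enforces.

```python
SYSTEM_FOLDERS = {"inbox", "sent", "drafts", "trash", "junk", "archive"}

# Allowed moves: folder a message is in -> folders the agent may move it to.
TRANSITIONS = {
    "inbox": {"archive", "escalations", "junk", "trash"},
    "escalations": {"inbox", "archive"},  # a human hands it back or closes it
    "archive": set(),                     # processed, no further action
}


def validate_custom_folder(name: str) -> str:
    """Reject custom folder names that collide with reserved system names."""
    if name in SYSTEM_FOLDERS:
        raise ValueError(f"{name!r} is a reserved system folder name")
    return name


def next_folder(current: str, bucket: str, needs_human: bool = False) -> str:
    """Pick the destination folder for a classified message."""
    if needs_human:
        target = "escalations"
    elif bucket in ("FYI", "ARCHIVE"):
        target = "archive"  # assumption: FYI needs no action, so it leaves the queue
    else:
        target = current    # URGENT/ACTION stay queued until a reply is handled
    if target != current and target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal move {current} -> {target}")
    return target
```

Writing the transitions down as data makes an illegal move (for example, escalating something already archived) an error instead of a silent surprise.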
Here's an efficiency the borrowed-inbox version can't touch: on an agent-owned mailbox, rules run at the SMTP layer, before the `message.created` webhook ever fires. Inbound mail flows through policy checks on arrival (`block` rejects at the SMTP stage, `mark_as_spam` routes to `junk`, `assign_to_folder` files it), and rule evaluations are logged for audit.
So the deterministic 80% of your Archive bucket (newsletters, automated alerts, known senders) never costs an LLM token. The model only reasons over mail that survived the filter. Cheaper, faster, and the agent reacts to less noise, which also means fewer chances to misfire on a mailer-daemon reply.
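A quick local model of that split helps when estimating how much mail the LLM will actually see. The sketch below is not the API's rule schema: the `(sender glob, action, folder)` tuples, the first-match-wins ordering, and the example addresses are all assumptions. Only the three action names and their outcomes come from the description above.

```python
from fnmatch import fnmatch

# Hypothetical local representation of rules: (sender glob, action, folder).
RULES = [
    ("*@spam.example", "block", None),
    ("promo@*", "mark_as_spam", None),
    ("newsletter@*", "assign_to_folder", "archive"),
]


def route(sender: str) -> str:
    """Return where a message ends up: 'rejected', a folder name, or 'inbox'."""
    for pattern, action, folder in RULES:
        if not fnmatch(sender.lower(), pattern):
            continue
        if action == "block":
            return "rejected"  # refused at the SMTP stage
        if action == "mark_as_spam":
            return "junk"
        if action == "assign_to_folder":
            return folder
    return "inbox"  # survived the filter, so the model sees it


def llm_workload(senders: list[str]) -> list[str]:
    """Only mail that lands in the inbox is handed to the model."""
    return [s for s in senders if route(s) == "inbox"]
```

Running yesterday's sender list through `llm_workload` gives a rough estimate of the token bill before you write a single server-side rule.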
A few mailbox facts worth designing around:
- Some webhook notifications arrive as `message.created.truncated` with the body omitted, so always fetch the full message before classifying.

And drafts deserve a special mention: full CRUD lives at `/v3/grants/{grant_id}/drafts`, and sending an existing draft behaves exactly like a normal send. That's the primitive that makes the approval gate clean: the agent's output is a reviewable draft object, not an irreversible send.
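A webhook handler can encode the truncation rule in a few lines. This sketch only builds URLs and makes no network calls. The drafts path is the one quoted above; the base URL, the messages path (written by analogy with the drafts path), and the `type` / `data.object` payload layout are assumptions to check against the API reference.

```python
from typing import Optional

API_BASE = "https://api.us.nylas.com"  # assumption: US region base URL


def needs_full_fetch(event: dict) -> bool:
    """True when the notification type ends in .truncated (body omitted)."""
    return event.get("type", "").endswith(".truncated")


def message_url(grant_id: str, message_id: str) -> str:
    # Assumed path for fetching a single message in full.
    return f"{API_BASE}/v3/grants/{grant_id}/messages/{message_id}"


def drafts_url(grant_id: str) -> str:
    # Drafts collection path as given in the text.
    return f"{API_BASE}/v3/grants/{grant_id}/drafts"


def full_message_url(event: dict) -> Optional[str]:
    """Return the URL to fetch before classifying, or None if the body is present."""
    if not needs_full_fetch(event):
        return None
    obj = event["data"]["object"]  # assumed payload layout
    return message_url(obj["grant_id"], obj["id"])
```

Checking the event type before classification keeps the model from ever reasoning over a message whose body it never received.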
Can server-side rules replace the LLM entirely? For the Archive bucket, mostly yes: known senders and automated alerts are deterministic. For Urgent vs. Action, no: "is this a client issue or a status update?" needs judgment, and that's the part worth spending tokens on. The split is the design: rules below, model above.
What happens to skipped drafts? They stay in the queue, and because drafts are real objects in the `drafts` folder with full CRUD, "revisit at the end of the session" is a list call, not a memory exercise. Nothing is lost by deferring a decision.
The interactive flow is written up in the inbox-zero recipe, and the mailbox mechanics (folders, lifecycle, deliverability signals) in how mailboxes work.
Try this tomorrow morning: run the four-bucket loop once against 50 unread messages and time it. If it beats your usual triage, the follow-up question gets interesting: which of your team's shared inboxes deserves to become a robot's inbox first?