Build a ChatGPT-Style Email Plugin

A developer built a ChatGPT-style email plugin using Nylas APIs and function calling with large language models. The system defines three tools (list_messages, get_message, and send_email) with a human-in-the-loop gate for sending. By trimming API responses to four essential fields, the developer reduced payload size by about 80%.

Here's how the story usually goes. Saturday afternoon, you wire a language model to a mailbox for the first time. You type "summarize my unread mail" and watch it actually happen: the model scans, picks out the thread from your landlord, nails the summary. Magic. Sunday morning, drunk on possibility, you add a send capability. Sunday evening, you're reading a transcript where a newsletter's footer text nearly convinced the model to forward something it shouldn't, and you quietly remove the send tool until you understand what just happened.

The gap between Saturday and Sunday is the actual engineering of an AI email assistant. The model can't touch a mailbox on its own. You give it tools: small server-side functions that wrap email endpoints, run when the model asks, and hand results back. The model decides; your code acts. Getting that boundary right is the whole game.

The pattern works identically for ChatGPT, Claude, or any model with function calling: a tool is a JSON schema with a `name`, `description`, and typed `parameters`. Define three: `list_messages`, `get_message`, `send_email`. The descriptions are what the model reasons over, so write them like instructions, and keep parameter counts low; models pick correctly from 3 to 5 fields far more reliably than from 15.

```json
{
  "name": "send_email",
  "description": "Send an email from the user's mailbox. Requires human approval first.",
  "parameters": {
    "type": "object",
    "properties": {
      "to": { "type": "string", "description": "Recipient email address" },
      "subject": { "type": "string" },
      "body": { "type": "string", "description": "HTML or plain text body" }
    },
    "required": ["to", "subject", "body"]
  }
}
```

All three tools map to two endpoints: list and get both hit `GET /v3/grants/{grant_id}/messages`, send hits `POST /v3/grants/{grant_id}/messages/send`.
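
Only the `send_email` schema is spelled out here, so the following is a hedged sketch of what the two read-only tools might look like. The parameter names (`limit`, `unread`, `message_id`) mirror the arguments the dispatcher reads; the description wording and the `check_tool` helper are hypothetical illustrations, not taken from the guide.

```python
# Hypothetical schemas for the two read-only tools. Parameter names mirror
# the arguments the dispatcher reads; descriptions are illustrative only.
LIST_MESSAGES = {
    "name": "list_messages",
    "description": "List recent messages in the user's mailbox. "
                   "Call this first to find message IDs.",
    "parameters": {
        "type": "object",
        "properties": {
            "limit": {"type": "integer",
                      "description": "How many messages to return (default 50, max 200)"},
            "unread": {"type": "boolean",
                       "description": "If true, return only unread messages"},
        },
        "required": [],
    },
}

GET_MESSAGE = {
    "name": "get_message",
    "description": "Fetch one full message by ID. "
                   "Call only for the messages that matter.",
    "parameters": {
        "type": "object",
        "properties": {
            "message_id": {"type": "string",
                           "description": "ID returned by list_messages"},
        },
        "required": ["message_id"],
    },
}


def check_tool(tool, max_params=5):
    """Return a list of problems with a tool schema (empty list means OK)."""
    problems = []
    props = tool["parameters"]["properties"]
    if not tool.get("description"):
        problems.append("missing description")
    if len(props) > max_params:
        problems.append(f"too many parameters: {len(props)}")
    for field in tool["parameters"].get("required", []):
        if field not in props:
            problems.append(f"required field not defined: {field}")
    return problems
```

Running `check_tool` over each schema before registering it is one cheap way to enforce the "few parameters, real description" advice; how the schemas are wrapped for a specific provider's SDK is left out on purpose.
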
One dispatcher handles the lot:

```python
import requests

# NYLAS_API (base URL including /v3) and HEADERS (auth headers) are assumed
# to be defined at module level.
def run_tool(name, args, grant_id):
    base = f"{NYLAS_API}/grants/{grant_id}/messages"
    if name == "list_messages":
        # Cap the page size at the API maximum of 200.
        params = {"limit": min(args.get("limit", 50), 200)}
        if args.get("unread"):
            params["unread"] = "true"
        return requests.get(base, headers=HEADERS, params=params).json()
    if name == "get_message":
        return requests.get(f"{base}/{args['message_id']}", headers=HEADERS).json()
    if name == "send_email":
        if not args.get("approved"):  # human-in-the-loop gate
            return {"status": "pending_approval"}
        payload = {"to": [{"email": args["to"]}],
                   "subject": args["subject"], "body": args["body"]}
        return requests.post(f"{base}/send", headers=HEADERS, json=payload).json()
```

The `grant_id` identifies whose mailbox you're operating: a connected Gmail or Outlook account, or an Agent Account (https://developer.nylas.com/docs/v3/agent-accounts/), a hosted mailbox the assistant owns outright and currently in beta, if you'd rather the bot have its own address. Same endpoints either way; sends work across 6 providers (Google, Microsoft, Yahoo, iCloud, IMAP, and EWS) with zero SMTP setup.

Token cost scales with what you feed the model, and raw API responses are bloated for this purpose: a list response carries dozens of fields per message. Triage needs four:

```python
def slim(message):
    return {
        "id": message["id"],
        "from": message["from"][0]["email"],
        "subject": message["subject"],
        "snippet": message.get("snippet", "")[:200],  # first 200 characters only
    }
```

Trimming a 50-message list this way cuts the payload by about 80% versus full message objects. The flow becomes: list → slim → model picks the IDs that matter → `get_message` for those few full bodies → summarize. List returns 50 messages by default with a 200 maximum, so cap the limit and never dump a 200-message inbox into one prompt.

Trace "summarize my unread mail and flag anything urgent" through the machinery. The model calls `list_messages` with `{"unread": true, "limit": 50}`.
The dispatcher calls `GET /v3/grants/{grant_id}/messages`, slims each result to four fields, and returns the trimmed list as the tool output. The model picks the few messages that matter and issues `get_message` calls for them. If it goes on to call `send_email` ..., it gets `{"status": "pending_approval"}` back, because nothing leaves without a human click.

Two details to notice. The model never saw an API key, a raw header, or a message it didn't ask for. And the expensive step, fetching full bodies, happened for 3 messages, not 50. That's the shape of every well-built turn: broad and cheap, then narrow and complete.

When the human does approve, the confirmation is just the same tool call with the gate flag set:

```python
draft = {"to": "ada@example.com", "subject": "Re: Q2 plan",
         "body": "Thanks Ada, 9am PT works. I'll send an invite."}
# Show the draft to the user, get an explicit yes, THEN:
draft["approved"] = True
run_tool("send_email", draft, grant_id)
```

Back to that send tool. Four practices cover the failure modes that cause real incidents. The central one is the approval gate: `send_email` returns `pending_approval` until a person sees the full draft and signs off. This one gate neutralizes both hallucinated sends and injected ones, at the cost of one click, and one wrong send costs far more than that click.

The complete recipe (full dispatcher, both provider wrappings, the security checklist) is in the ChatGPT email plugin guide (https://developer.nylas.com/docs/cookbook/agents/chatgpt-email-plugin/). When you outgrow single-turn chat, the email triage agent (https://developer.nylas.com/docs/cookbook/agents/email-triage-agent/) runs the same tools on a cron, and inbox zero with an agent (https://developer.nylas.com/docs/cookbook/agents/inbox-zero/) keeps a human approving every action.

Next step: implement just `list_messages` and `get_message` tonight, read-only, no send tool at all, and ask the model to triage your real inbox. You'll learn more from twenty minutes of watching its tool calls than from any post about it, this one included.
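
For that read-only first pass, here is a minimal sketch of the "broad and cheap, then narrow and complete" turn. It makes a few assumptions: the list response wraps messages in a `data` array, and the dispatcher is passed in as a parameter so the flow can be exercised offline. `triage_turn`, `pick_ids`, `fake_run_tool`, and the fake inbox are hypothetical stand-ins for the model and a real mailbox.

```python
def slim(message):
    """Keep only the four fields triage needs (tolerates missing fields)."""
    senders = message.get("from") or [{}]
    return {
        "id": message["id"],
        "from": senders[0].get("email", ""),
        "subject": message.get("subject", ""),
        "snippet": message.get("snippet", "")[:200],
    }


def triage_turn(run_tool, pick_ids, grant_id, limit=50):
    """List and slim first, then fetch full bodies only for the chosen IDs.

    run_tool: callable(name, args, grant_id) returning parsed JSON.
    pick_ids: callable(slimmed_list) returning message IDs; stands in for the model.
    """
    listing = run_tool("list_messages", {"unread": True, "limit": limit}, grant_id)
    # Assumption: messages live under a top-level "data" key.
    slimmed = [slim(m) for m in listing.get("data", [])]
    chosen = pick_ids(slimmed)
    full = [run_tool("get_message", {"message_id": mid}, grant_id) for mid in chosen]
    return slimmed, full


# Offline stand-ins so the flow can be tried without a real mailbox.
FAKE_INBOX = [
    {"id": "m1", "from": [{"email": "landlord@example.com"}],
     "subject": "URGENT: water shutoff", "snippet": "x" * 500,
     "folders": ["INBOX"], "unread": True},
    {"id": "m2", "from": [{"email": "news@example.com"}],
     "subject": "Weekly digest", "snippet": "Top stories",
     "folders": ["INBOX"], "unread": True},
]


def fake_run_tool(name, args, grant_id):
    if name == "list_messages":
        return {"data": FAKE_INBOX[: args.get("limit", 50)]}
    if name == "get_message":
        return {"data": next(m for m in FAKE_INBOX if m["id"] == args["message_id"])}
    # Read-only mode: there is deliberately no send_email branch.
    raise ValueError(f"tool not available in read-only mode: {name}")


def pick_urgent(slimmed):
    return [m["id"] for m in slimmed if "urgent" in m["subject"].lower()]


slimmed, full = triage_turn(fake_run_tool, pick_urgent, grant_id="demo-grant")
```

Swapping `fake_run_tool` for the real dispatcher and `pick_urgent` for the model's tool calls keeps the same shape: every message gets slimmed, and only the chosen few are fetched in full.
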