{"slug": "build-the-reply-loop-receive-think-respond", "title": "Build the Reply Loop: Receive, Think, Respond", "summary": "A developer building an email agent using Nylas Agent Accounts details the receive-think-respond loop, highlighting edge cases such as the `message.created.truncated` webhook for bodies over 1 MB and the need to avoid replying to the agent's own outbound messages. The implementation uses a state machine keyed by thread_id to route replies, with the LLM returning a nextStep value to advance the conversation.", "body_md": "About 1 MB. That's the body-size threshold where the `message.created` webhook quietly changes shape: the trigger becomes `message.created.truncated` and the body is omitted entirely. If your email agent reads bodies straight off webhook payloads, it works fine for months and then silently drops the one reply that contained a forwarded contract. That detail is a good preview of this whole topic: the receive-think-respond loop is conceptually simple, and every interesting bug lives in the edges.\n\nLet's wire the loop properly, using a [Nylas Agent Account](https://developer.nylas.com/docs/v3/agent-accounts/) (in beta) as the agent's mailbox.\n\nA `message.created` webhook fires when mail arrives. Treat it as notification only:\n\n``` js\napp.post(\"/webhooks/nylas\", async (req, res) => {\n  res.status(200).end(); // ack fast, work async\n\n  const event = req.body;\n  // Oversized bodies arrive as message.created.truncated; accept both triggers.\n  const inbound = [\"message.created\", \"message.created.truncated\"];\n  if (!inbound.includes(event.type)) return;\n\n  const msg = event.data.object;\n  if (msg.grant_id !== AGENT_GRANT_ID) return;\n\n  // Outbound fires message.created too -- don't reply to yourself.\n  if (msg.from?.[0]?.email === AGENT_EMAIL) return;\n\n  const conversation = await db.conversations.findByThreadId(msg.thread_id);\n  conversation\n    ? await continueConversation(msg, conversation)\n    : await triageNewInbound(msg);\n});\n```\n\nThree load-bearing lines in there. The grant check keeps other accounts' traffic out. 
The `from` check matters because **the webhook fires for outbound mail too**: skip it and your agent replies to its own replies, forever. And the `thread_id` lookup is how a reply gets recognized as a reply: messages are grouped into threads using the `In-Reply-To` and `References` headers, so if your agent sent the original message, the inbound reply lands on a thread you already have state for. No header parsing on your side.\n\nThe payload carries summary fields: `subject`, `from`, `snippet`. Before the model decides anything, fetch the real data:\n\n``` js\nconst fullMessage = await nylas.messages.find({\n  identifier: AGENT_GRANT_ID,\n  messageId: msg.id,\n});\n\nconst thread = await nylas.threads.find({\n  identifier: AGENT_GRANT_ID,\n  threadId: msg.thread_id,\n});\n// thread.data.messageIds -> fetch each, sort by date, build transcript\n```\n\nAn LLM answering \"sounds good, let's do Thursday\" needs to know what was proposed; the full thread is the conversation memory. For long threads, you don't need every message verbatim: summarize the early turns and pass the last 3–4 in full. Same context, fraction of the tokens.\n\nYour own state machine supplies the other half of the context. 
A conversation record keyed by `thread_id` tracks a `step` field, and the handler routes on it before any model call happens:\n\n``` js\nasync function routeReply(message, history, context) {\n  switch (context.step) {\n    case \"awaiting_confirmation\":\n      // The agent proposed something and is waiting for a yes/no.\n      await handleConfirmation(message, history, context);\n      break;\n    case \"awaiting_info\":\n      // The agent asked a question and needs the answer.\n      await handleInfoResponse(message, history, context);\n      break;\n    case \"closed\":\n      // The conversation was resolved but the person wrote back.\n      await handleReopenedThread(message, history, context);\n      break;\n    default:\n      // Unknown state -- log and escalate.\n      await escalateToHuman(message, context);\n  }\n}\n```\n\nA \"yes\" means something different depending on what the agent asked, and the `default` branch matters: an unknown state should escalate, not improvise. The other useful trick from the [multi-turn recipe](https://developer.nylas.com/docs/cookbook/agent-accounts/multi-turn-conversations/): have the LLM return a `nextStep` value along with the reply text, so the model itself advances the state machine instead of your code guessing where the conversation went.\n\n``` js\nconst sent = await nylas.messages.send({\n  identifier: AGENT_GRANT_ID,\n  requestBody: {\n    replyToMessageId: msg.id,\n    to: fullMessage.data.from,\n    subject: `Re: ${fullMessage.data.subject}`,\n    body: replyBody,\n  },\n});\n```\n\nPassing `replyToMessageId` (`reply_to_message_id` in the raw API payload) makes the platform set `In-Reply-To` and `References` on the outbound message, so the recipient's mail client renders a threaded reply instead of a disconnected new email. Skip it and every reply starts a new thread, which is the fastest way to make an agent feel broken to the human on the other end. 
The mechanics are covered in depth in the [handle-replies recipe](https://developer.nylas.com/docs/cookbook/agent-accounts/handle-replies/).\n\nAfter sending, update the conversation record: bump the turn count, set the next `step`, stamp `lastActivityAt`.\n\n**Self-reply loops.** Covered above, but it's the #1 footgun. One missing `from` check equals an infinite conversation with yourself.\n\n**Duplicate replies.** Webhook redelivery and concurrent workers will both re-trigger your handler at any volume, not just at scale. Without dedup and locking, the same inbound message generates two LLM calls and two replies. Treat idempotency as a launch requirement, not a hardening task.\n\n**Rapid-fire corrections.** Humans send \"let's do Thursday\" and then \"actually, Friday\" eleven seconds apart. A 30–60 second cooldown before responding lets you batch consecutive inbound messages into one coherent reply instead of answering each individually.\n\n**Runaway conversations.** An unbounded loop is a token sink and a risk. The [multi-turn recipe](https://developer.nylas.com/docs/cookbook/agent-accounts/multi-turn-conversations/) bakes a `maxTurns` cap into the conversation record (10 is a reasonable default) and escalates to a human when it's hit.\n\n**Zombie threads.** Someone replies to a conversation that went quiet weeks ago. Decide the behavior up front; a sane rule is escalating anything dormant past 168 hours (one week) rather than letting the agent auto-resume with stale context.\n\n**Multiple repliers on one thread.** CC someone and you've invited a second voice into the conversation: two people might both reply to the same agent message. Process each inbound independently, and check whether the agent has already responded since the last inbound before generating another reply.\n\n**Lost state.** The gap between turns can be days, so conversation records live in Postgres, Redis with AOF, DynamoDB, or anything else that survives restarts. 
In-memory state means every deploy lobotomizes your agent mid-conversation.\n\nNot every conversation ends with the agent's final word, and the exits deserve code too. Escalation is a state change plus a notification:\n\n``` js\nasync function escalate(conversation, reason) {\n  await db.conversations.update(conversation.threadId, {\n    step: \"escalated\",\n    metadata: { ...conversation.metadata, escalationReason: reason },\n  });\n  await notifyHumanOperator({\n    threadId: conversation.threadId,\n    contact: conversation.contactEmail,\n    reason,\n  });\n}\n```\n\nCompletion is the same move with `step: \"completed\"`, and it's not just bookkeeping. When the prospect books the meeting or the support question gets answered, marking the record done changes how the *next* inbound on that thread routes: it hits the `closed` branch of your router (map `completed` to that case, or store `closed` directly, so the two names agree) instead of generating an out-of-context continuation. The state machine's exits are what make its middle states trustworthy.\n\nOne last note on the front door: verify the `X-Nylas-Signature` header before your handler does anything. An unverified webhook endpoint is an API that lets anyone on the internet make your agent send email.\n\nBuild the loop in this order: webhook handler with the three guard checks → thread fetching → a hardcoded reply (no LLM yet) → verify threading works in a real mail client → then swap in the model. Wiring the LLM first is the classic mistake; you end up debugging prompt quality and webhook delivery simultaneously.\n\nWhich failure mode bit you first? 
Mine's universal enough that I'll guess: the agent replied to itself.", "url": "https://wpnews.pro/news/build-the-reply-loop-receive-think-respond", "canonical_source": "https://dev.to/qasim157/build-the-reply-loop-receive-think-respond-2me7", "published_at": "2026-06-16 17:05:58+00:00", "updated_at": "2026-06-16 17:17:37.130513+00:00", "lang": "en", "topics": ["developer-tools", "ai-agents", "natural-language-processing", "large-language-models"], "entities": ["Nylas", "Nylas Agent Accounts"], "alternates": {"html": "https://wpnews.pro/news/build-the-reply-loop-receive-think-respond", "markdown": "https://wpnews.pro/news/build-the-reply-loop-receive-think-respond.md", "text": "https://wpnews.pro/news/build-the-reply-loop-receive-think-respond.txt", "jsonld": "https://wpnews.pro/news/build-the-reply-loop-receive-think-respond.jsonld"}}