Stop Your Agent From Replying Twice: Dedup Patterns

wpnews.pro

Ever watched an email agent reply to the same message twice? The recipient gets two near-identical responses seconds apart, screenshots them, and your carefully engineered assistant suddenly looks like a script with a stutter. Worse: under real load, this isn't a freak event. It's the default outcome if you haven't designed against it.

The double-reply problem has three distinct causes, and each one needs its own fix. Let's walk through them.

First cause: webhook redelivery. Nylas, like most webhook providers, guarantees at-least-once delivery. If your endpoint doesn't return a 200 fast enough, or a transient network blip eats the response, the same message.created notification shows up again. Process both, send two replies.

Second: concurrent workers. Your handler probably runs on multiple instances — Lambda invocations, ECS tasks, worker processes. Two of them can pick up the same notification at nearly the same instant and both start generating a reply.

Third: shared inboxes. Two agents (or an agent and a human) watching the same mailbox can both decide a message is theirs to answer. This one isn't a duplicate event at all — it's a coordination problem, and it's the hardest to patch at the application layer.

Track which message IDs you've processed, and check before doing anything else:

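app.post("/webhooks/nylas", async (req, res) => {
  // Acknowledge right away so slow processing doesn't trigger a redelivery.
  res.status(200).end();

  const event = req.body;
  if (event.type !== "message.created") return;

  const messageId = event.data.object.id;

  try {
    // Atomic check-and-set. If the key exists, bail.
    const alreadyProcessed = await db.processedMessages.setIfAbsent(messageId, {
      receivedAt: Date.now(),
    });

    if (alreadyProcessed) return;

    await handleMessage(event.data.object);
  } catch (err) {
    // The response has already been sent, so log the failure here
    // instead of letting the promise rejection go unhandled.
    console.error("webhook handler failed", { messageId, err });
  }
});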

The check-and-set must be atomic. In Redis that's SET messageId 1 NX EX 86400; in Postgres it's INSERT ... ON CONFLICT DO NOTHING with a row-count check. Give the record a TTL of 24 hours: long enough that a webhook redelivered hours later still gets caught, short enough that the table doesn't grow forever.
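To make the Postgres variant concrete, here is a minimal sketch of the row-count check. The table name processed_messages, its columns, and the claimMessage helper are assumptions for illustration, not part of any library; client is assumed to be anything with a node-postgres-style query(text, values) method that resolves to an object with a rowCount field.

```javascript
// Hypothetical table (an assumption for this sketch):
//   CREATE TABLE processed_messages (
//     message_id  text PRIMARY KEY,
//     received_at timestamptz NOT NULL DEFAULT now()
//   );
const CLAIM_SQL = `
  INSERT INTO processed_messages (message_id)
  VALUES ($1)
  ON CONFLICT (message_id) DO NOTHING`;

// Postgres has no per-row TTL, so expiry becomes a periodic delete
// (for example from a scheduled job).
const PURGE_SQL = `
  DELETE FROM processed_messages
  WHERE received_at < now() - interval '24 hours'`;

// Resolves to true only for the caller whose INSERT actually created the row.
// A conflicting insert affects zero rows, which is the "already processed" signal.
async function claimMessage(client, messageId) {
  const result = await client.query(CLAIM_SQL, [messageId]);
  return result.rowCount === 1;
}
```

Note that the return value is inverted relative to alreadyProcessed in the handler: a handler using this helper would bail when claimMessage resolves to false.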

Dedup alone isn't enough. Two workers can race past the check-and-set within the same millisecond window. A per-thread lock closes that gap:

async function handleMessage(msg) {
  // Acquire a lock on this thread. If another worker holds it, skip.
  const lock = await db.acquireLock(`thread:${msg.thread_id}`, {
    ttlMs: 30_000, // release after 30 seconds if the worker crashes
  });

  if (!lock.acquired) return; // someone else has it

  try {
    // Double-check: has a reply already gone out since this message arrived?
    const thread = await nylas.threads.find({
      identifier: AGENT_GRANT_ID,
      threadId: msg.thread_id,
    });

    const latestMessage = thread.data.latestDraftOrMessage;
    if (latestMessage && latestMessage.from?.[0]?.email === AGENT_EMAIL) {
      return; // a prior worker or retry already replied
    }

    await generateAndSendReply(msg);
  } finally {
    await lock.release();
  }
}

The double-check inside the lock is the part people skip, and it matters: between the webhook arriving and the lock being acquired, another worker might have already finished the job. Fetching the thread and inspecting its latest message — if it's from the agent's own address, bail — catches exactly that window. The 30-second TTL on the lock is your crash insurance; a worker that dies mid-reply shouldn't hold the thread hostage forever.
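One detail worth sketching is what acquire and release have to guarantee once a TTL is involved. The version below is an in-memory stand-in, not the article's db.acquireLock: createLockTable, the numeric owner token, and the injectable clock are all assumptions made for illustration. It is only atomic inside a single Node process; across workers the same idea needs a shared store such as Redis or Postgres.

```javascript
// In-memory stand-in for a lock table. Nothing awaits between the check and
// the write, so within one process the acquire step cannot interleave.
function createLockTable(now = () => Date.now()) {
  const locks = new Map(); // key -> { token, expiresAt }
  let nextToken = 0;

  function acquire(key, ttlMs) {
    const held = locks.get(key);
    if (held && held.expiresAt > now()) {
      // Someone else holds a live lock: report failure, release is a no-op.
      return { acquired: false, release: () => false };
    }
    const token = ++nextToken;
    locks.set(key, { token, expiresAt: now() + ttlMs });
    return {
      acquired: true,
      // Release only if we still own the lock. If our TTL lapsed and another
      // worker took over, deleting the entry would unlock their critical section.
      release() {
        const current = locks.get(key);
        if (!current || current.token !== token) return false;
        locks.delete(key);
        return true;
      },
    };
  }

  return { acquire };
}
```

The owner token is the design choice to notice: a worker that stalls past its 30 seconds and then calls release should not be able to free a lock that now belongs to someone else.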

Dedup catches the same event delivered twice. Locking catches the same event processed simultaneously. You need both; neither substitutes for the other.

The cleanest answer to the coordination problem is to delete it. Agent Accounts — Nylas-hosted mailboxes for AI agents, currently in beta — make this cheap: each agent gets its own address, its own inbox, and its own webhook stream.

sales-agent@agents.yourcompany.com does outbound prospecting

support-agent@agents.yourcompany.com handles inbound support

scheduling@agents.yourcompany.com coordinates meetings

Each handler filters to its own grant with if (msg.grant_id !== MY_GRANT_ID) return; so no two agents ever see the same message. When a human needs oversight, give them read-only IMAP access instead of a second automated writer.
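If several agents end up behind one webhook endpoint, the same grant filter can be written as a lookup table. This is a sketch under assumptions: createGrantRouter, the grant IDs, and the handler functions are hypothetical names, and the event shape follows the handler shown earlier with grant_id on the message object.

```javascript
// Maps each grant ID to exactly one handler. `handlersByGrantId` is a Map so
// that an unexpected grant ID can never resolve to an inherited object property.
function createGrantRouter(handlersByGrantId) {
  return function route(event) {
    if (event.type !== "message.created") return null;

    const msg = event.data.object;
    const handler = handlersByGrantId.get(msg.grant_id);

    // Unknown grant: not ours to answer, so drop it rather than guess.
    if (!handler) return null;

    return handler(msg);
  };
}
```

A router built from new Map([[SALES_GRANT_ID, handleSales], [SUPPORT_GRANT_ID, handleSupport]]) gives each message at most one owner, which is the property the one-agent-per-inbox setup is after.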

Here's the failure mode the first three fixes don't cover: a logic bug. Outbound messages fire message.created too, so an agent that accidentally responds to its own mail enters a loop: reply triggers webhook triggers reply. A per-thread send cap is the circuit breaker:

const recentSends = await db.recentAgentSends(threadId, { withinMinutes: 5 });

if (recentSends >= 3) {
  await escalateToHuman(threadId, "reply rate limit hit");
  return;
}

Three sends on one thread in 5 minutes means something's wrong; escalate instead of sending. And always filter the agent's own address at the top of every handler — it's one line and it prevents the whole class of self-reply loops.
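Both guards fit in a few lines. The sketch below is one possible shape, assuming the message carries a from array like the thread check earlier; isOwnMessage, createSendLimiter, and tryRecordSend are hypothetical names, and the counter lives in process memory, so a multi-worker deployment would need to keep the timestamps in a shared store instead.

```javascript
// True when the message was sent by the agent itself (case-insensitive match).
function isOwnMessage(msg, agentEmail) {
  const sender = msg.from?.[0]?.email ?? "";
  return sender.toLowerCase() === agentEmail.toLowerCase();
}

// Sliding-window cap: at most `maxSends` per thread within `windowMs`.
function createSendLimiter({ maxSends = 3, windowMs = 5 * 60_000, now = () => Date.now() } = {}) {
  const sendsByThread = new Map(); // threadId -> timestamps of recent sends

  return {
    // Records a send and returns true while under the cap.
    // Returns false once the cap is reached, which is the cue to escalate.
    tryRecordSend(threadId) {
      const cutoff = now() - windowMs;
      const recent = (sendsByThread.get(threadId) ?? []).filter((ts) => ts > cutoff);
      if (recent.length >= maxSends) {
        sendsByThread.set(threadId, recent);
        return false;
      }
      recent.push(now());
      sendsByThread.set(threadId, recent);
      return true;
    },
  };
}
```

With the defaults, the fourth send on a thread inside five minutes is refused, which matches the recentSends >= 3 check above.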

For agent mailboxes, rules can pre-sort inbound mail server-side. Route automated notifications to a separate folder, block spam at the SMTP layer, and have your handler skip folders the agent shouldn't answer:

curl --request POST \
  --url "https://api.us.nylas.com/v3/rules" \
  --header "Authorization: Bearer $NYLAS_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "match": [{ "field": "from.domain", "operator": "equals", "value": "noreply.example.com" }],
    "actions": [{ "action": "assign_to_folder", "value": "notifications" }]
  }'

Less noise reaching your handler means fewer chances for conflicting logic to fire.
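On the handler side, the folder skip can be a one-line predicate. This sketch assumes the message object exposes a folders array of folder IDs and that the folder created by the rule above is identified as "notifications"; both the ID format and the isInSkippedFolder helper are assumptions to adapt to your mailbox.

```javascript
// Folder IDs the agent should never answer from (assumed ID, mirrors the rule above).
const SKIP_FOLDERS = new Set(["notifications"]);

// True when the message sits in at least one folder the agent should ignore.
// A message with no `folders` field is treated as answerable.
function isInSkippedFolder(msg, skipFolders = SKIP_FOLDERS) {
  return (msg.folders ?? []).some((folderId) => skipFolders.has(folderId));
}
```

Placed near the top of the handler, if (isInSkippedFolder(msg)) return; keeps pre-sorted mail from ever reaching the reply logic.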

Failure mode	What it looks like	The fix
Webhook redelivery	Same event, delivered twice	Atomic check-and-set on the message ID
Concurrent workers	Same event, processed simultaneously	Per-thread lock + double-check inside it
Shared inbox	Two writers, one mailbox	One agent per inbox; IMAP for human oversight
Self-reply loop	Agent answers its own outbound mail	Sender filter + per-thread send cap

Notice that no single fix covers another row. Dedup does nothing against a race between workers; locking does nothing against a logic bug that replies to the agent's own messages. If you're operating at any meaningful volume, you want all four.

How long should dedup records live? 24–48 hours. Long enough to catch a webhook redelivered hours later, short enough that the table stays small. A webhook for a message ID older than that is almost certainly a bug, not a redelivery.

Do outbound sends really fire webhooks? Yes. message.created fires for messages the agent sends, not just messages it receives. That's why the self-reply filter (if (sender === AGENT_EMAIL) return;) belongs at the top of every handler, before any other logic runs.

Can I skip the lock if my handler is single-instance? Today, maybe. But "single-instance" is a deployment detail, not an architecture guarantee — the day someone scales the worker pool to two, the race appears. The lock costs one Redis call; the double reply costs a customer screenshot.

A single-threaded test will never surface any of this. The only reliable verification is synthetic load with concurrent webhook deliveries — fire the same payload at your endpoint from multiple connections and confirm exactly one reply goes out. And when you skip a duplicate, log it; silent skips turn debugging into archaeology.
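A concurrent-delivery check can be small. The sketch below is one way to structure it, with hypothetical names throughout: fireDuplicates takes an injected deliver function so the same helper can hit a real endpoint or a stub, and createStubEndpoint is a toy handler used only to dry-run the harness.

```javascript
// Calls `deliver(payload)` `copies` times without waiting in between,
// then waits for every attempt to settle.
async function fireDuplicates(deliver, payload, copies = 20) {
  const attempts = Array.from({ length: copies }, () => deliver(payload));
  return Promise.all(attempts);
}

// Toy endpoint: dedups on message ID, then "replies" after a short delay.
function createStubEndpoint() {
  const claimed = new Set();
  const repliesSent = [];

  async function deliver(event) {
    const id = event.data.object.id;
    if (claimed.has(id)) return "skipped";
    // Claim before the first await; claiming after it would let
    // every concurrent call pass the check above.
    claimed.add(id);
    await new Promise((resolve) => setTimeout(resolve, 5)); // simulated reply generation
    repliesSent.push(id);
    return "replied";
  }

  return { deliver, repliesSent };
}
```

Against a deployed handler, deliver would instead POST the payload to your webhook URL, and the assertion becomes "exactly one reply exists on the thread" after all requests return.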

The full recipe — including the thread double-check and TTL guidance — is in the duplicate-reply prevention guide, with the upstream reply-handling loop next door.

What's your dedup story? If you've shipped a webhook consumer that survived a redelivery storm — or didn't — I'd like to hear what finally fixed it.

source & further reading

Original article: dev.to
