# Scaling to Thousands of Agent Mailboxes

> Source: <https://dev.to/qasim157/scaling-to-thousands-of-agent-mailboxes-pmp>
> Published: 2026-06-15 18:56:23+00:00

Week one: a single test mailbox on a trial domain, provisioned by hand from the dashboard. Week twelve: a fleet of agent mailboxes spread across customer domains, each sending real mail with its own quota and reputation. The API calls are the same at both scales — what changes is everything around them: how you provision, how you share configuration, and how you keep one bad sender from pausing the fleet.

Here's what the path from one to thousands looks like with Nylas [Agent Accounts](https://developer.nylas.com/docs/v3/agent-accounts/provisioning/), which are currently in beta.

There's no OAuth dance to scale around. Creating a mailbox is one POST with `"provider": "nylas"` (no refresh token, no consent screen), so a fleet provisioner is just iteration:

```
curl --request POST \
  --url "https://api.us.nylas.com/v3/connect/custom" \
  --header "Authorization: Bearer <NYLAS_API_KEY>" \
  --header "Content-Type: application/json" \
  --data '{
    "provider": "nylas",
    "workspace_id": "<WORKSPACE_ID>",
    "settings": {
      "email": "agent-0042@agents.yourcompany.com"
    }
  }'
```

Two scaling-relevant details in that request. First, the domain: one application can manage accounts across any number of registered domains, and the docs explicitly recommend splitting high-volume outbound across multiple domains (`sales-a.yourcompany.com`, `sales-b.yourcompany.com`) so reputation damage on one doesn't contaminate the rest. Second, the `workspace_id`: passing it at creation is how each account picks up its configuration, which brings us to the part that makes fleets manageable.
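As a rough sketch of that iteration, the Python below builds the same request as the curl example for each mailbox and spreads addresses round-robin across sending domains. The helper names (`build_provision_request`, `fleet_addresses`) and the `agent-0000` naming scheme are illustrative assumptions, not part of the Nylas API; only the URL, headers, and body fields come from the request shown above.

```python
import json
import urllib.request

API_BASE = "https://api.us.nylas.com"  # same host as the curl example


def build_provision_request(api_key: str, workspace_id: str, email: str) -> urllib.request.Request:
    """Build (but do not send) the POST that creates one Agent Account."""
    body = {
        "provider": "nylas",
        "workspace_id": workspace_id,
        "settings": {"email": email},
    }
    return urllib.request.Request(
        url=f"{API_BASE}/v3/connect/custom",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def fleet_addresses(prefix: str, count: int, domains: list[str]) -> list[str]:
    """Hypothetical naming scheme: spread mailboxes round-robin across domains."""
    return [f"{prefix}-{i:04d}@{domains[i % len(domains)]}" for i in range(count)]


if __name__ == "__main__":
    domains = ["sales-a.yourcompany.com", "sales-b.yourcompany.com"]
    for address in fleet_addresses("agent", 4, domains):
        request = build_provision_request("<NYLAS_API_KEY>", "<WORKSPACE_ID>", address)
        # urllib.request.urlopen(request) would send it; left out so the sketch runs offline.
        print(request.get_method(), request.full_url, address)
```

Keeping request construction separate from sending makes the provisioner easy to dry-run and to wrap in whatever retry or rate-limiting logic the fleet needs.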

At fleet scale, per-grant configuration is a non-starter. The model here is indirection: [policies and rules](https://developer.nylas.com/docs/v3/agent-accounts/policies-rules-lists/) attach to *workspaces*, and every account in a workspace inherits them. One policy object — daily send quota, storage cap, retention windows, spam sensitivity — governs a thousand mailboxes, and updating it updates all of them at once.

The recommended carve-up is one workspace per agent archetype: outreach agents get a workspace with high send quotas and strict outbound rules; triage agents get one with aggressive spam filtering and modest quotas. With `auto_group` enabled, new accounts join the right workspace by email domain automatically, so your provisioner can't misfile them.

Allow/block lists scale the same way. A list holds domains, TLDs, or addresses; rules reference it through `in_list`; and you can add up to 1,000 items per request. Update the list and every rule referencing it picks up the change immediately: no redeploys, and non-engineers can own the contents.
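The 1,000-item cap means a large import has to be split client-side. A minimal sketch of that batching follows; `batches` is a hypothetical helper, and the call that actually uploads each batch is left out because the list endpoint isn't shown here.

```python
from typing import Iterator, Sequence

MAX_ITEMS_PER_REQUEST = 1000  # per-request cap quoted above


def batches(items: Sequence[str], size: int = MAX_ITEMS_PER_REQUEST) -> Iterator[list[str]]:
    """Yield list items in order, de-duplicated, at most `size` per batch."""
    cleaned = (item.strip() for item in items)
    unique = list(dict.fromkeys(item for item in cleaned if item))
    for start in range(0, len(unique), size):
        yield unique[start:start + size]
```

Each yielded batch would map to one add-items request; de-duplicating first avoids spending request slots on repeats.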

Send volume isn't the hard limit; deliverability is. The platform tracks each account's rolling bounce and complaint rates against recent send volume, and the [thresholds](https://developer.nylas.com/docs/v3/agent-accounts/send-limits/) are unforgiving at fleet scale:

| Signal | Healthy | Under review | Sending paused |
|---|---|---|---|
| Bounce rate | Under 2% | 5% | 10% |
| Complaint rate | Under 0.1% | 0.1% | 0.5% |
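The table can be read as a small classifier. The sketch below assumes each threshold is inclusive (a rate at or above the number trips the state) and labels the unstated bounce range between 2% and 5% as `elevated`; both choices are assumptions, not documented behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    healthy_below: float
    review_at: float
    pause_at: float


BOUNCE = Thresholds(healthy_below=0.02, review_at=0.05, pause_at=0.10)
COMPLAINT = Thresholds(healthy_below=0.001, review_at=0.001, pause_at=0.005)


def classify(events: int, sends: int, limits: Thresholds) -> str:
    """Map a bounce or complaint count over recent sends to a table column."""
    if sends == 0:
        return "healthy"  # nothing sent, nothing to judge
    rate = events / sends
    if rate >= limits.pause_at:
        return "paused"
    if rate >= limits.review_at:
        return "under_review"
    if rate < limits.healthy_below:
        return "healthy"
    return "elevated"  # assumed label for the gap between healthy and review
```

For example, 2 complaints across 1,000 recent sends is a 0.2% rate, which lands in `under_review`; 6 complaints across the same volume is 0.6% and lands in `paused`.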

Only hard bounces count (full mailboxes and greylisting don't), and the denominator is a recent representative send volume, not a fixed time window, so the rate stays meaningful at any scale. But a paused account doesn't resume on a timer: clearing a pause requires contacting support with the cause and the fix. Multiply that by a fleet and the lesson is obvious: you want your own circuit breakers tripping *before* the platform's do.

"Under review" is silent to your application: sending continues. A pause is not. Outbound send requests start failing with a `400 Bad Request` carrying text from the underlying infrastructure about the account being suspended or paused. Fleet send paths should recognize that shape, alongside the more mundane `400` for `"domain is not verified"` (a provisioning step got skipped) and `429` for `"rate limit exceeded"`.

On quotas: the free plan allows 200 messages per account per day, paid plans have no daily cap by default, and a policy can set a stricter per-account quota. At fleet scale that policy quota doubles as a cheap circuit breaker — a runaway agent hits its own ceiling long before it can damage the domain's reputation.

The docs hand you the telemetry for exactly that. Four webhook triggers (`message.transactional.delivered`, `.bounced`, `.complaint`, and `.rejected`) carry the same events the rates are computed from. Wire them into your own per-account counters and pause your outbound logic when an account trends toward 5% bounces. You'll see the problem in your own metrics before enforcement does.
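One way to build those counters is a rolling window of recent outcomes per grant, with a local trip point set below the platform's review mark. In the sketch below the window size, minimum sample, and 3% trip point are all assumed values, and `BounceBreaker` is a hypothetical class. It also counts every `.bounced` event, which may overcount if that trigger includes soft bounces.

```python
from collections import defaultdict, deque

WINDOW = 500              # assumption: recent sends remembered per account
MIN_SENDS = 50            # assumption: don't judge an account on a tiny sample
LOCAL_BOUNCE_TRIP = 0.03  # assumption: trip well before the 5% review mark


class BounceBreaker:
    """Per-grant rolling bounce counter fed by message.transactional.* events."""

    def __init__(self) -> None:
        self._outcomes: dict[str, deque] = defaultdict(lambda: deque(maxlen=WINDOW))

    def record(self, grant_id: str, trigger: str) -> None:
        """Record one outcome: True for a bounce, False for a delivery."""
        if trigger.endswith(".delivered"):
            self._outcomes[grant_id].append(False)
        elif trigger.endswith(".bounced"):
            self._outcomes[grant_id].append(True)
        # .complaint and .rejected would feed separate counters; omitted here.

    def bounce_rate(self, grant_id: str) -> float:
        outcomes = self._outcomes[grant_id]
        return sum(outcomes) / len(outcomes) if outcomes else 0.0

    def allow_send(self, grant_id: str) -> bool:
        """False once the account's recent bounce rate reaches the local trip point."""
        if len(self._outcomes[grant_id]) < MIN_SENDS:
            return True
        return self.bounce_rate(grant_id) < LOCAL_BOUNCE_TRIP
```

The send path would call `allow_send(grant_id)` before each outbound message and hold or reroute work for any account that returns `False`.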

The other enforcement path is the abuse restriction: a `403` with `send blocked by abuse restriction`, applied by the Nylas operations team rather than by a threshold. Restrictions can scope to a single sender address, a sender domain (including its subdomains), an organization, an application, or one specific grant; the most specific match applies. The application-level case is the one that matters for fleets: it stops *every* Agent Account under the application, not just the misbehaving one. Recovery means contacting support with the application ID, the grant ID, and one example error response; once the restriction is lifted, sends succeed on the next attempt with no propagation delay. Fleet code should treat `429` and `403` as first-class states, not exceptions to log and forget.
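Treating those responses as states could look like the sketch below, which maps a status code and response text onto an explicit enum. The matching is by substring on the messages quoted in this article; the exact response body format is an assumption, and `SendState` and `classify_send_response` are hypothetical names.

```python
from enum import Enum


class SendState(Enum):
    OK = "ok"
    RATE_LIMITED = "rate_limited"            # 429: back off and retry later
    DOMAIN_UNVERIFIED = "domain_unverified"  # 400: a provisioning step was skipped
    ACCOUNT_PAUSED = "account_paused"        # 400: deliverability pause, needs support
    ABUSE_BLOCKED = "abuse_blocked"          # 403: abuse restriction, needs support
    OTHER_ERROR = "other_error"


def classify_send_response(status: int, body_text: str) -> SendState:
    """Turn a send response into a state the fleet scheduler can act on."""
    text = body_text.lower()
    if 200 <= status < 300:
        return SendState.OK
    if status == 429:
        return SendState.RATE_LIMITED
    if status == 403 and "abuse restriction" in text:
        return SendState.ABUSE_BLOCKED
    if status == 400 and "domain is not verified" in text:
        return SendState.DOMAIN_UNVERIFIED
    if status == 400 and ("suspended" in text or "paused" in text):
        return SendState.ACCOUNT_PAUSED
    return SendState.OTHER_ERROR
```

Only `RATE_LIMITED` is worth retrying automatically; `ACCOUNT_PAUSED` and `ABUSE_BLOCKED` both need a human and a support ticket, so the scheduler should stop sending for the affected scope and alert instead.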

The boring hygiene matters too, because the complaint threshold is tiny — at low volume, a handful of recipients clicking "mark as spam" can put an account under review at 0.1%. Validate recipient addresses before sending, skip anything that has hard-bounced before, honor unsubscribes immediately, use double opt-in for lists you care about, and get DKIM, SPF, and DMARC right on every domain — misconfigured authentication shows up as extra hard bounces from servers that refuse the mail outright.
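The first three habits fit in one pre-send filter. This is a sketch under assumptions: the suppression sets are whatever store you keep for prior hard bounces and unsubscribes, addresses are compared case-insensitively, and the syntax check is deliberately shallow (it cannot prove a mailbox exists).

```python
def looks_deliverable(address: str) -> bool:
    """Shallow syntax check: one '@', non-empty local part, dotted domain."""
    local, sep, domain = address.strip().partition("@")
    if not sep or not local or "@" in domain:
        return False
    return "." in domain and not domain.startswith(".") and not domain.endswith(".")


def sendable_recipients(candidates, hard_bounced, unsubscribed):
    """Drop malformed, duplicate, previously bounced, and unsubscribed addresses."""
    suppressed = {a.strip().lower() for a in hard_bounced}
    suppressed |= {a.strip().lower() for a in unsubscribed}
    seen = set()
    result = []
    for address in candidates:
        key = address.strip().lower()
        if key in seen or key in suppressed or not looks_deliverable(key):
            continue
        seen.add(key)
        result.append(address.strip())
    return result
```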

The receiving side scales more gracefully than you'd expect, because webhook subscriptions are application-level, not per-grant. One `message.created` subscription covers every mailbox in the fleet; each payload carries the `grant_id`, so your handler routes by grant. There's no per-mailbox registration step in the provisioning loop and no subscription cleanup in the teardown path.

That makes the consumer architecture the standard one for any high-volume webhook source: receive, enqueue keyed by `grant_id`, process from the queue. Nothing exotic; the point is that the per-account overhead on the inbound path is zero.
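A minimal in-process version of receive, enqueue, process is sketched below. It assumes the `grant_id` sits at `data.object.grant_id` in the webhook payload and uses standard-library queues as a stand-in for a real broker; the partition count is arbitrary.

```python
import queue
import zlib

NUM_PARTITIONS = 8  # assumption: one worker drains each partition

partitions = [queue.Queue() for _ in range(NUM_PARTITIONS)]


def partition_for(grant_id: str) -> int:
    """Stable partition index, so one grant's events stay ordered on one queue."""
    return zlib.crc32(grant_id.encode("utf-8")) % NUM_PARTITIONS


def enqueue(payload: dict) -> int:
    """Webhook receiver body: route by grant_id and defer real work to workers."""
    # Assumed payload shape: {"type": ..., "data": {"object": {"grant_id": ...}}}
    grant_id = payload["data"]["object"]["grant_id"]
    index = partition_for(grant_id)
    partitions[index].put(payload)
    return index
```

`zlib.crc32` is used instead of the built-in `hash()` because Python randomizes string hashes per process, which would scatter a grant's events across partitions after a restart.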

Condensed from the three docs above:

- Pass `workspace_id` in every provisioning call; treat `auto_group` as a backstop.
- Subscribe to the `message.transactional.*` triggers and build per-account bounce/complaint counters from day one.
- Treat `429` and `403` as expected states in your send path.

If you're sketching a fleet right now, start with the workspace layout; it's the one decision that's annoying to retrofit. What's your target mailbox count, and which threshold worries you more: the 0.5% complaint pause or the application-wide abuse block?
