# Cache Invalidation for AI Consumers: Keeping Agent-Facing Endpoints Fresh Without Busting the CDN Edge

> Source: <https://blog.r-lopes.com/posts/2026-06-06-cache-invalidation-for-ai-consumers-keeping-agent-facing-en>
> Published: 2026-06-06 14:00:00+00:00

## The Problem

Agent-facing endpoints (the `/api/*` routes that LLM tool calls, retrieval pipelines, and autonomous agents hit dozens of times per task) sit awkwardly between two cache models. Human-facing HTML can tolerate a 60-second stale window because a person won't notice; an agent reasoning over a chain of five tool calls absolutely will, because stale data in call #2 poisons every downstream inference. The naive fix, `Cache-Control: no-store` everywhere, collapses your edge hit ratio and pushes every agent request to origin, which is the failure mode CDNs were built to prevent [Source 2](#source-2).

## The Shape

```ts
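// app/api/agent/[resource]/route.ts
import { NextRequest, NextResponse } from 'next/server'
import { revalidateTag } from 'next/cache'
// Assumption: `db` and `loadResource` are app-specific helpers; this module path is hypothetical.
import { db, loadResource } from '@/lib/resources'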

export const dynamic = 'force-dynamic'

const FRESH = 30
const SWR = 300

export async function GET(req: NextRequest, { params }: { params: { resource: string } }) {
  const tag = `agent:${params.resource}`
  const etag = await computeEtag(params.resource)

  if (req.headers.get('if-none-match') === etag) {
    return new NextResponse(null, {
      status: 304,
      headers: {
        'Cache-Control': `public, max-age=${FRESH}, stale-while-revalidate=${SWR}`,
        'ETag': etag,
        'Vary': 'Accept, X-Agent-Consumer',
        // Cloudflare's purge-by-tag reads tags from the `Cache-Tag` response header.
        'Cache-Tag': tag,
      },
    })
  }

  const data = await loadResource(params.resource, { tag })

  return NextResponse.json(data, {
    headers: {
      'Cache-Control': `public, max-age=${FRESH}, stale-while-revalidate=${SWR}`,
      'ETag': etag,
      'Vary': 'Accept, X-Agent-Consumer',
      'Cache-Tag': tag,
      'X-Deployment-Id': process.env.NEXT_DEPLOYMENT_ID ?? 'dev',
    },
  })
}

// app/api/invalidate/route.ts
export async function POST(req: NextRequest) {
  const secret = req.headers.get('x-invalidate-secret')
  if (secret !== process.env.INVALIDATE_SECRET) {
    return new NextResponse('forbidden', { status: 403 })
  }
  const { tags } = (await req.json()) as { tags: string[] }
  for (const t of tags) revalidateTag(t)

  await fetch('https://api.cloudflare.com/client/v4/zones/' + process.env.CF_ZONE + '/purge_cache', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.CF_TOKEN}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ tags }),
  })

  return NextResponse.json({ purged: tags })
}

async function computeEtag(resource: string): Promise<string> {
  const row = await db.query('SELECT updated_at, version FROM resources WHERE id = $1', [resource])
  return `"${row.version}-${row.updated_at.getTime()}"`
}
```

## How It Works

The contract has three moving parts: a short `max-age` paired with a long `stale-while-revalidate`, a version-derived `ETag`, and tag-keyed purges from the writer side. `max-age=30, stale-while-revalidate=300` tells the edge to serve cached bytes for 30 seconds with zero origin contact, then for the next 300 seconds to serve stale bytes immediately while revalidating asynchronously, so user-facing latency stays flat during refresh [Source 2](#source-2). For agents this matters double: an LLM tool call that blocks on a cold origin fetch burns wall-clock against the model's reasoning budget, not just user patience.
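The two windows can be made concrete with a small classifier. This is a minimal sketch of the timeline described above, not any CDN's actual implementation; `classifyAge` and the `Freshness` type are hypothetical names, and the defaults mirror `FRESH` and `SWR` from the route.

```ts
// Hypothetical helper: classifies a cached response by its age in seconds.
type Freshness = 'fresh' | 'stale-while-revalidate' | 'expired'

function classifyAge(ageSeconds: number, maxAge = 30, swr = 300): Freshness {
  // Fresh: served from the edge with no origin contact.
  if (ageSeconds < maxAge) return 'fresh'
  // Stale but inside the SWR window: served immediately, refreshed in the background.
  if (ageSeconds < maxAge + swr) return 'stale-while-revalidate'
  // Past both windows: the cache has to go to origin before answering.
  return 'expired'
}
```

Under these assumptions an entry aged 29 seconds is fresh, one aged 200 seconds is served stale while the edge refreshes it, and one aged 400 seconds forces a blocking origin fetch.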

The `ETag` is the agent's escape valve from `max-age`. When an agent has a hot loop hitting the same resource, it sends `If-None-Match` and the edge returns `304` in single-digit milliseconds without round-tripping the body. The tag, `agent:${resource}`, is what writers grab to invalidate. `revalidateTag` is Next.js's mechanism for blowing away just the entries that depend on a given key, and the framework prioritizes availability over strict consistency: cache write failures still serve the response, and the next request triggers a fresh render [Source 4](#source-4).
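The route compares `If-None-Match` to the current tag with strict equality, which misses two cases HTTP allows: a comma-separated list of tags and the weak `W/` prefix. A minimal sketch of a more tolerant comparison follows. `etagMatches` is a hypothetical helper, and splitting on commas assumes tags that contain no commas, which holds for the `"version-timestamp"` shape produced by `computeEtag`.

```ts
// Hypothetical helper: weak comparison of an If-None-Match header against the current ETag.
function etagMatches(ifNoneMatch: string | null, current: string): boolean {
  if (ifNoneMatch === null) return false
  const header = ifNoneMatch.trim()
  // "*" matches any current representation.
  if (header === '*') return true
  // Weak comparison ignores the W/ prefix on either side.
  const strip = (tag: string): string => tag.trim().replace(/^W\//, '')
  const target = strip(current)
  // Simplification: assumes no commas inside the quoted tag values.
  return header.split(',').some((candidate) => strip(candidate) === target)
}
```

In the route handler this would replace the `===` check, for example `if (etagMatches(req.headers.get('if-none-match'), etag))`.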

The `Vary: Accept, X-Agent-Consumer` header is the non-obvious lever. Agents and humans usually want the same resource shaped differently: JSON for the agent, HTML or RSC for the browser. Caching them under one key produces the HTML/RSC inconsistency failure mode where mismatched payloads collide during client-side navigation [Source 4](#source-4). `Vary` partitions the cache so an invalidation on one variant doesn't strand the other with a different TTL.
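To see why the variants stop colliding, it helps to look at how a `Vary`-aware cache might derive its key. This is an illustrative sketch only: `varyCacheKey` is a hypothetical function, the key format is invented for the example, and CDNs differ in which `Vary` headers they actually honor, so check your provider's behavior.

```ts
// Hypothetical sketch: build a cache key from the URL plus the request headers named in Vary.
function varyCacheKey(url: string, vary: string, requestHeaders: Record<string, string>): string {
  // Header names are case-insensitive, so normalize before lookup.
  const normalized: Record<string, string> = {}
  for (const name of Object.keys(requestHeaders)) {
    normalized[name.toLowerCase()] = requestHeaders[name] ?? ''
  }
  const parts = vary
    .split(',')
    .map((name) => name.trim().toLowerCase())
    .filter((name) => name.length > 0)
    .sort()
    // A header the request did not send contributes an empty value.
    .map((name) => `${name}=${normalized[name] ?? ''}`)
  return [url, ...parts].join('|')
}
```

With `Vary: Accept, X-Agent-Consumer`, a request sending `Accept: application/json` and one sending `Accept: text/html` produce different keys for the same URL, so each variant is stored and expires on its own.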

Cross-deployment skew is the last hazard. Rolling out a new build mid-flight will serve a mix of old and new payloads from the edge. Setting `deploymentId` (mirrored here as `X-Deployment-Id`) triggers a hard navigation on build-ID change so agents and clients re-fetch consistent content [Source 4](#source-4).
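An agent has no navigation to force, so the equivalent check has to live in the agent's own tool layer. One possible sketch, assuming the agent records the `X-Deployment-Id` header of each response in a task: `findDeploymentSkew` is a hypothetical helper, and what to do on skew (re-fetch, restart the chain) is a policy decision the source does not prescribe.

```ts
// Hypothetical helper: returns the distinct deployment IDs seen in one task
// when there is more than one, or an empty array when the task is consistent.
function findDeploymentSkew(deploymentIds: Array<string | null>): string[] {
  const seen: string[] = []
  for (const id of deploymentIds) {
    // Responses without the header are ignored rather than treated as skew.
    if (id !== null && seen.indexOf(id) === -1) seen.push(id)
  }
  return seen.length > 1 ? seen : []
}
```

A non-empty result means at least two builds answered calls in the same chain, which is the signal to re-fetch the earlier responses before reasoning over them together.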

```
                        write (DB)
                            │
                            ▼
                     ┌──────────────┐
   POST /invalidate  │  origin app  │  revalidateTag('agent:x')
       ──────────►   │  (Next.js)   │  ───────────────────────►
                     └──────┬───────┘            │
                            │                    ▼
                            │           Cloudflare purge by tag
                            ▼                    │
                 ┌──────────────────┐ ◄──────────┘
   agent GET ──► │  CDN edge (PoP)  │  max-age=30, swr=300
                 └──────────────────┘  Vary: Accept, X-Agent-Consumer
                            │
                  304 (ETag match)  or  200 (fresh body)
```
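The writer-side arrow in the diagram can be sketched as a small request builder that runs after the database write commits. The `/api/invalidate` path, the `x-invalidate-secret` header, and the `{ tags }` body come from the route above; `buildInvalidateRequest`, the `InvalidateRequest` shape, and the origin URL in the usage note are hypothetical.

```ts
// Hypothetical shape for the request a writer sends to the invalidate route.
interface InvalidateRequest {
  url: string
  method: 'POST'
  headers: Record<string, string>
  body: string
}

function buildInvalidateRequest(origin: string, secret: string, resources: string[]): InvalidateRequest {
  // Same tag scheme as the GET route; de-duplicated so each tag is purged once.
  const tags = Array.from(new Set(resources.map((resource) => `agent:${resource}`)))
  return {
    url: `${origin}/api/invalidate`,
    method: 'POST',
    headers: { 'content-type': 'application/json', 'x-invalidate-secret': secret },
    body: JSON.stringify({ tags }),
  }
}
```

A writer would then call something like `fetch(request.url, request)` with the result of `buildInvalidateRequest('https://app.example.com', secret, ['orders'])`, keeping the secret in server-side configuration.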

## When It Breaks

| Condition | What happens | Use instead |
|---|---|---|
| Agent loop polls faster than `max-age=30` | Edge serves identical bytes; no freshness signal reaches the loop | Drop `max-age` to 5s; let `stale-while-revalidate` absorb the rest |
| HTML and JSON variants cached with different TTLs | Client-side navigation shows mismatched content | `Vary` to partition […] |
| […] `max-age` expiry | […] | […] `revalidateTag` as authoritative; CDN purge as best-effort backup [Source 4](#source-4) |
| […] | […] | […] `deploymentId`; force hard navigation on build-ID change [Source 4](#source-4) |
| […] | […] | […] [Source 1](#source-1) [Source 3](#source-3) [Source 2](#source-2) |
| […] `R=1` read replica behind the origin | […] | […] `R=majority` for the post-invalidate read path [Source 2](#source-2) |
| […] | […] | […] (`http`, `agent-json`) per the Service spec [Source 1](#source-1) [Source 3](#source-3) |

## CEMENT Brick

If you serve agent-facing endpoints with the same `Cache-Control` profile you'd use for human HTML, then a single stale tool-call response will poison every downstream inference in a chained agent task, because LLMs cannot distinguish "this data is 60 seconds old" from "this data is wrong". The only defenses are a short `max-age` paired with `stale-while-revalidate` for edge offload [Source 2](#source-2), `ETag`-driven `304`s for hot loops, tag-keyed `revalidateTag` purges at write time [Source 4](#source-4), and `Vary` partitioning so the agent JSON variant and the human HTML variant invalidate independently without colliding [Source 4](#source-4).

## Sources

- Engineering Docs
- Engineering Docs
- Engineering Docs
- [How revalidation works in Next.js](https://nextjs.org/docs/app/guides/how-revalidation-works)
