{"slug": "you-don-t-need-vercel-pro-you-need-your-stack-to-sleep", "title": "You don't need Vercel Pro. You need your stack to sleep.", "summary": "A developer discovered that their Neon database bill was driven by compute time, not traffic, after AI coding agents generated code that prevented scale-to-zero. The database stayed active for 15 hours daily despite zero users, costing $32.65 in compute. The fix involves adding rules to AI agents to avoid patterns that keep the database awake.", "body_md": "TL;DR for vibe coders: Shipped an app with Cursor, Claude Code, or v0 and got a scary Vercel or Neon bill? You probably don't need a bigger plan. You need a few fixes. Not technical? Copy the `agent-rules` folder from the companion repo into your project and tell your AI editor \"apply these cost rules.\" That's the whole job.\n\nYou don't need Vercel Pro. You don't need to \"Launch\" on Neon. A lot of the time you need the opposite of an upgrade. You need your stack to go to sleep.\n\nHere's the moment that started this. A Neon bill landed, across three small Next.js apps running on Vercel with a Neon Postgres database, and the breakdown was lopsided in a way that gave the whole game away. $32.65 of it was compute. Storage was 5 cents. History and data transfer were zero. The meter said 308 hours of compute time in about three weeks, which is a database awake for roughly fifteen hours a day, every day.\n\nThe instinct when a bill climbs is \"I must be getting traffic, time for a bigger plan.\" So before doing that, I opened Google Analytics.\n\nZero users in the last 30 minutes. Meanwhile the database compute was pinned active, and had been more or less around the clock.\n\nSo this was never a storage problem or a traffic problem. 
Almost the entire bill was one thing: paying for a database to stay awake doing nothing.\n\nBoth of those things were true at the same time, and here's the part I'll say up front so nobody has to guess: I didn't hand-write the mistakes that caused this. AI coding agents did. I described features, the agent shipped working code, and that code carried a handful of patterns that quietly defeat scale-to-zero. That's not a confession of bad engineering. It's just what building looks like in 2026. Most of us are vibe-coding onto serverless and cloud infra now, and the agent optimizes for \"make the feature work,\" not \"keep the database asleep when no one's around.\" It has no idea your compute bills by the hour it stays awake, so it has no reason to care.\n\nThat's the whole reason this post exists, and why the fix at the end isn't a clever line of code. It's a set of rules the agent reads on every session. What follows is a short tour of the four patterns that kept the lights on, light on detail because you don't really need to memorize them. You need your agent to stop writing them.\n\nTwo layers, two meters.\n\nNeon bills compute by active time, which is roughly the hours the compute endpoint is running multiplied by its size in compute units. Storage is separate, and a suspended compute costs nothing. ([pricing](https://neon.com/pricing), [plans](https://neon.com/docs/introduction/plans))\n\nVercel bills builds (about $0.0035 per CPU-minute), functions (active CPU, memory, and invocations), bandwidth, and image optimization. ([pricing](https://vercel.com/docs/pricing))\n\nThe headline feature on both is scale to zero. When nothing is happening, the database suspends and your functions aren't running, so you pay close to nothing. Neon's compute auto-suspends after 5 minutes of inactivity by default and wakes again in a few hundred milliseconds. 
([scale to zero](https://neon.com/docs/introduction/scale-to-zero))\n\nSo why was an app with zero users never going to sleep? Because \"inactivity\" is carrying a lot of weight in that sentence, and four different things were quietly keeping the stack busy.\n\nBefore you fix anything, you have to be able to see it. There are two traps here.\n\nThe first trap is Google Analytics. GA is client-side JavaScript, so it only fires when a real browser loads your page and runs the script. The traffic that actually hits your functions and database, things like AI crawlers, search bots, uptime monitors, and scrapers, never runs that JavaScript. GA simply doesn't see it. (More on the crawlers in a minute, because they turned out to be the villain.)\n\nThe second trap is that Neon's `active_time` field lags. The cumulative `active_time` on the projects API updates on a coarse, roughly hourly cadence, so a quick before-and-after read will tell you nothing changed even when it did. The signal you want is the real-time endpoint state, `current_state`, which is either `active` or `idle`:\n\n```\n# Watch the compute go idle, then suspended, in real time\nNEON_API_KEY=napi_xxx NEON_PROJECT_ID=your-project node measure/endpoint-state.mjs\n```\n\nFor the cumulative proof, take an `active_time` snapshot, wait a few hours (overnight is best), and diff it against an unchanged control project, so you know a drop is your fix and not just a quiet hour. Both scripts are in the [companion repo](https://github.com/keqiang/stack-overslept).\n\nOne honest caveat. \"Truly zero\" has a small floor, because Neon's control plane checks availability periodically, so don't expect a perfectly flat line. What you're hunting for is long idle stretches, and those were completely missing here.\n\nThis one isn't about sleep, it's just the easiest money on the table. By default Vercel runs a metered cloud build on every push and every PR preview. 
Build it yourself and upload the result, and Vercel skips the build entirely:\n\n```\nvercel pull --yes --environment=production   # get project settings + prod env\nvercel build --prod                          # build on YOUR machine into .vercel/output\nvercel deploy --prebuilt --prod              # upload the output; Vercel does not rebuild\n```\n\nPrebuilt-only costs you preview URLs, push-to-deploy, and Git rollback. Run the same step from GitHub Actions instead and you keep all of that with no billed build minutes (see [examples/03-prebuilt-deploy](https://github.com/keqiang/stack-overslept/tree/main/examples/03-prebuilt-deploy)).\n\nThe worst offender was one innocent-looking line the agent reaches for by default, a module-level connection pool.\n\n``` js\n// keeps the database awake\nimport { Pool } from \"pg\";\nconst pool = new Pool({ connectionString: process.env.DATABASE_URL });\n```\n\nServerless functions reuse module scope across warm invocations, so this pool isn't recreated per request. It lives on, holding a TCP connection to Neon, and a held connection keeps the compute from ever scaling to zero. An app with no users pays for a database awake 24/7. The fix Neon recommends is the stateless HTTP driver: each query is a single HTTPS round-trip with no persistent connection. ([serverless driver](https://neon.com/docs/serverless/serverless-driver))\n\n``` js\n// lets the database sleep\nimport { neon } from \"@neondatabase/serverless\";\nconst sql = neon(process.env.DATABASE_URL, { fullResults: true });\nconst { rows } = await sql.query(\"SELECT now()\");\n```\n\nKeep a real `Pool` only in CLI scripts, migrations, and long-running servers. (If you genuinely need a pool inside functions, Vercel's Fluid Compute plus [attachDatabasePool()](https://vercel.com/docs/fluid-compute) closes the same leak.) Full before-and-after in `examples/01` and `02`.\n\nOne app had a cron polling a table every 5 minutes for work to do. 
Next to a 5-minute auto-suspend, the math is brutal: every time the database tries to sleep, the cron pokes it awake. Net result is ~100% active, forever, for a job that almost always found nothing.\n\nIf the work is event-driven, don't poll for it. Do it at request time, return the response immediately, and run the slow part in the background with `after()` from `next/server`. Then delete the cron, so the database only gets touched when something real happens.\n\n``` ts\nimport { after } from \"next/server\";\n\nexport async function POST(req: Request) {\n  const job = await createJob(req);                 // fast: the user gets a response now\n  after(async () => { await processJob(job.id); }); // slow part runs after the response\n  return Response.json(job);\n}\n```\n\nKeep crons for genuinely scheduled work like a nightly digest or a daily cleanup, and have them return early without touching the database when there's nothing to do.\n\nWith the pool and the cron fixed, one app still wouldn't sleep. The cause was the traffic GA can't see: LLM crawlers. The app had just shipped a big SEO surface (sitemap, `llms.txt`, per-entity `llm.txt` routes, public JSON), which is catnip for GPTBot, ClaudeBot, PerplexityBot and friends. They crawl hard, and none of them run your analytics JavaScript, so GA showed a clean zero while bots made up the majority of actual hits. ([Cloudflare Radar](https://radar.cloudflare.com/), [Vercel on AI bot traffic](https://vercel.com/blog/the-three-types-of-ai-bot-traffic-and-how-to-handle-them))\n\nWhat made it a cost story: every bot hit touched the database twice. A write, because middleware logged each visit with an `INSERT` into a `crawler_hits` table. And a read, because the crawler routes were `force-dynamic`, so each fetch ran fresh queries instead of serving cache. Bots hammering DB-backed dynamic routes around the clock means a database that never sleeps. 
Three fixes:\n\n- Don't run an `INSERT` on the request path.\n- Drop `force-dynamic`; a route with `export const revalidate = N` serves from cache and only hits the DB on revalidation.\n- For `[slug]`/`[id]` routes, add an empty `generateStaticParams()` so they cache on demand instead of querying every time. (Match `revalidate` to the data.)\n\nAfter this, a bot crawl serves from the CDN instead of waking the database, and the compute can finally suspend between real events.\n\nThe proof was in the real-time endpoint state and the forward `active_time` delta. Long idle stretches appeared where there had been none, the unchanged control project stayed put, and the apps did the same work while spending most of their lives asleep. No plan change required. You can run the same check on your own stack with the [scripts in the repo](https://github.com/keqiang/stack-overslept/tree/main/measure). \"Good\" looks like your database sitting idle or suspended whenever no human is actively using the app.\n\nHere's the payoff to what I said at the top. The agent wrote these patterns, so the agent will write them again next week unless something stops it. Fixing the code once doesn't hold. The durable fix is putting the lessons where the agent reads them: in its rules.\n\nI dropped a compact set of rules into the repo, split into `neon.md` and `vercel.md` (one file per platform, so you can use just the one you need), with an `AGENTS.md` and a `CLAUDE.md` that point your editor at them. They say, in plain imperative terms: use the HTTP driver in handlers, don't poll the DB on a timer, never leave DB-backed routes `force-dynamic`, don't write to the DB on every bot hit, build prebuilt to skip build minutes, and measure with endpoint state instead of GA. Now the agent reads them on every session and prevents the regressions instead of shipping them.\n\nThat's also why the [companion repo](https://github.com/keqiang/stack-overslept) is built the way it is. 
If you're not deep in serverless billing (maybe you vibe-coded the app and just want the bill to stop), you don't have to understand any of the above. Copy the rules file into your editor and tell your agent to apply it. Cost knowledge belongs in the agent's context, not just in a senior engineer's head.\n\nCopy and paste this into your next serverless project:\n\n- `neon()` in request handlers; `Pool` only in CLI and migrations (or Fluid Compute with `attachDatabasePool()`).\n- No polling crons for event-driven work; do it at request time with `after()`.\n- No `force-dynamic` on DB-backed public routes; ISR with `revalidate` matched to the data's cadence; empty `generateStaticParams()` for `[slug]` routes.\n- `vercel deploy --prebuilt` to skip remote build minutes.\n- Measure with `current_state` plus `active_time` delta and the Vercel usage dashboard, not Google Analytics.\n\nA surprise serverless bill is usually not a signal to upgrade. It's a signal that something in your code is keeping the lights on. Turn them off.\n\n*Built three apps on Vercel and Neon and want them to actually scale to zero? The full toolkit, including agent rules, runnable examples, and measurement scripts, is in the companion repo. 
If it helped, a star helps other people find it.*", "url": "https://wpnews.pro/news/you-don-t-need-vercel-pro-you-need-your-stack-to-sleep", "canonical_source": "https://dev.to/keqiang/you-dont-need-vercel-pro-you-need-your-stack-to-sleep-3lph", "published_at": "2026-06-24 00:50:34+00:00", "updated_at": "2026-06-24 01:14:55.810307+00:00", "lang": "en", "topics": ["artificial-intelligence", "developer-tools", "ai-agents", "ai-products"], "entities": ["Neon", "Vercel", "Cursor", "Claude Code", "v0", "Google Analytics"], "alternates": {"html": "https://wpnews.pro/news/you-don-t-need-vercel-pro-you-need-your-stack-to-sleep", "markdown": "https://wpnews.pro/news/you-don-t-need-vercel-pro-you-need-your-stack-to-sleep.md", "text": "https://wpnews.pro/news/you-don-t-need-vercel-pro-you-need-your-stack-to-sleep.txt", "jsonld": "https://wpnews.pro/news/you-don-t-need-vercel-pro-you-need-your-stack-to-sleep.jsonld"}}