Show HN: Aquifer – a control plane for agentic API traffic

Aquifer, a self-hosted API request queue, has been released to control the pace of inbound and outbound agentic traffic and prevent partial outages from cascading. The tool absorbs traffic bursts by durably queuing requests to SQLite and releasing them at a configurable rate, allowing backends and upstream APIs to dictate their own traffic pace. Aquifer supports both inbound protection for a user's own API and outbound rate limiting for external services like OpenAI and Stripe, with automatic backoff based on upstream response headers.

Self-hosted API request queue. Controls the pace of inbound and outbound traffic so partial outages don't cascade.

APIs get hit in bursts by agents, schedulers, or high-volume clients. Your backend gets overwhelmed on inbound. Your app gets 429s on outbound. One slow dependency takes everything else down with it.

Aquifer absorbs the burst, queues requests durably to SQLite, and releases them at the rate you configure. Your backend decides the pace. The upstream decides the pace. Whoever needs to slow things down wins.

Inbound: protect your API

    agents / clients → POST /jobs to Aquifer → your backend at controlled RPS

Agents hammering your API? Aquifer queues their requests and drains them to your backend at a pace it can handle. Your backend returns `X-Aquifer-Rps` headers to signal, in real time, how fast it wants traffic.

Outbound: respect external APIs

    your app → POST /jobs to Aquifer → OpenAI / Stripe / any API at controlled RPS

Calling a rate-limited upstream? Aquifer queues the calls and dispatches them at your configured rate. If the upstream signals a slowdown via headers, Aquifer backs off automatically.

In both cases, the upstream response headers have the final say on pace. Your config sets the ceiling. Headers can only reduce the rate below it, never raise it above. When pressure clears, the rate recovers gradually back to your ceiling.
- Client POSTs a job (target URL, method, headers, body, webhook URL) and moves on
- Aquifer persists it to SQLite, so it survives crashes and is re-dispatched on restart
- A per-upstream worker dispatches at your configured RPS with jitter
- On completion, Aquifer POSTs your webhook with the response body and status
- The upstream can adjust the rate live via `X-Aquifer-*` response headers

Binary:

    go install github.com/rjpruitt16/aquifer@latest
    aquifer

Docker:

    docker run -p 8080:8080 -v $(pwd)/data:/data \
      -e DB_PATH=/data/aquifer.db \
      ghcr.io/rjpruitt16/aquifer

Fly.io:

    git clone https://github.com/rjpruitt16/aquifer
    cd aquifer
    flyctl launch --name my-aquifer --no-deploy
    flyctl volumes create aquifer_data --size 1 --region iad
    flyctl deploy

Set `CONFIG_PATH` to a YAML file to configure rate limits per upstream hostname:

    # aquifer.yml (copy from aquifer.example.yml)
    defaults:
      rps: 2
      max_concurrent: 1
    upstreams:
      api.openai.com:
        rps: 10
        max_concurrent: 3
      api.stripe.com:
        rps: 20
        max_concurrent: 5
      your-backend.internal:
        rps: 50
        max_concurrent: 10

| Env var | Default | Description |
|---|---|---|
| `PORT` | `8080` | HTTP listen port |
| `DB_PATH` | `aquifer.db` | SQLite database path |
| `CONFIG_PATH` | none | Path to rate limit config YAML |

Request body for `POST /jobs`:

    {
      "user_id": "user-123",
      "idempotent_key": "invoice-42-notify",
      "url": "https://api.openai.com/v1/chat/completions",
      "method": "POST",
      "headers": { "Authorization": "Bearer sk-..." },
      "body": "{\"model\":\"gpt-4o\",\"messages\": ... }",
      "webhook_url": "https://yourapp.com/webhooks/aquifer"
    }

Idempotent: a duplicate `idempotent_key` for the same `user_id` returns the existing job. The response is `201` when a new job is queued, or `200` with `"duplicate": true` when the job already exists.

Job object:

    {
      "job_id": "a3f9...",
      "status": "queued | in_flight | completed | failed",
      "url": "https://api.openai.com/v1/chat/completions",
      "method": "POST",
      "created_at": 1715000000000
    }

Server-Sent Events stream for live job updates.

    event: queued
    data: {"job_id":"a3f9...","status":"queued"}

    event: dispatching
    data: {"job_id":"a3f9..."}

    event: completed
    data: {"job_id":"a3f9...","response_status":200,"body":"..."}

Or `event: failed` with `{"job_id":"...","reason":"..."}`.

Position updates: while the job waits in the queue, a `position` event is broadcast every 2 seconds:

    event: position
    data: {"job_id":"a3f9...","position":4}

    curl -N http://localhost:8080/jobs/