Adding a Live Generate Button: Why Synchronous Beat a Job Queue

A developer added a synchronous 'Generate' button to a content dashboard, replacing an async job queue that required manual server intervention. The endpoint POST /api/drafts/generate?platform= runs the generator synchronously, returning drafts in the response within three to eight seconds. A MAX_PENDING cap prevents over-generation by blocking new requests until existing drafts are reviewed.

The dashboard could display drafts. It could not create them. You watched the queue and waited for a cron job to fire, or you SSH'd into the server and ran the generator manually. For a tool that is supposed to let you manage content, that is a fundamental gap. The fix is a single endpoint: POST /api/drafts/generate?platform= . Hit it, the generator runs synchronously for that platform, new drafts come back in the response, and the UI refreshes. That is the whole thing. It is not clever. The interesting decision was synchronous versus async. Async would be the "correct" answer in a talk on distributed systems. Kick the job onto a queue, poll for completion, show progress. But this is a personal tool running on localhost. The generator takes three to eight seconds depending on how chatty the LLM call gets. Synchronous blocks the request and gives you the drafts immediately. The spinner in the UI is honest: the user sees it while the server is actually working. When it clears, the drafts are there. The only real guard is MAX PENDING . If the queue already has drafts waiting for review, the endpoint returns early before hitting the generator. That keeps you from piling up drafts you will never read. Without the cap, a generate button you hit too often becomes a way to bury yourself in unreviewed output. The cap forces review before generation, not after. The frontend side is straightforward. generateNow in api.ts wraps the POST. The PlatformTab component gets a Generate button with a spinner state and a status string. When the request resolves, it calls the same refresh function the initial load uses. No separate state path for "fresh from generation" versus "loaded from server" because there is no difference at the data level. What I would do differently: the synchronous design works now, but the eight second tail latency on slow LLM calls is going to become annoying fast. The right next step is not a full job queue. It is streaming. Open an SSE connection when the user hits Generate, stream tokens as the draft builds, and let the user read it as it comes. That is more work than a queue but the UX is meaningfully better: you see whether the draft is going somewhere useful before it finishes, which makes the review step faster. The other thing I would revisit is the ?platform= query param. Right now you generate for one platform at a time. That makes sense during development when you want to test one pipeline. But the natural next question is "generate for all platforms that are under the cap." A request body with a platforms array is the cleaner interface there than a repeated query param call. Neither of those is the thing that was blocking the dashboard from being useful. The Generate button was. Ship the thing that makes the tool usable, then clean up the edges.