How I Built Velocity — An AI Platform That Turns Plain English Into Production-Grade n8n Workflows

A developer built Velocity, an AI platform that converts plain-English descriptions into production-grade n8n workflow JSON. The system uses a Next.js 16 frontend with Gemini API and Supabase backend, treating the system prompt as a strict schema contract to avoid common import errors. Velocity generates complete, importable workflows with real triggers, configured parameters, and error notifications.

You describe the automation. Velocity ships the workflow: importable, deployable, running in your n8n instance one click later.

Every automation builder has been there. You know exactly what you want: "when a form is submitted, enrich the lead, drop it in a sheet, and ping Slack." Then you spend the next two hours in the n8n canvas hunting for the right node, guessing parameter names, wiring connections, and debugging `Could not find property option` errors on import.

The knowledge to build that workflow exists. It's just locked behind hundreds of node types, `typeVersion` quirks, and expression syntax.

So I built Velocity, an AI platform that takes a plain-English description and generates a complete, valid, importable n8n workflow JSON. Not a 4-node demo. A production-grade pipeline with real triggers, configured parameters, error notifications, and sticky-note documentation: the depth of an actual published n8n template.

Velocity is a full-stack AI automation copilot. The core loop: describe → generate → deploy. Tech stack: Next.js 16, React 19, Google Gemini, Supabase, Redis, Tailwind CSS v4, and the n8n REST API.

[Architecture diagram: the browser sends its Supabase JWT in the Authorization header to the Next.js 16 app. The AI routes (/api/chat, /api/generate-workflow, /api/explain-error, /api/analyze-workflow, /api/plan-workflow) call the Gemini REST API with fallback models. /api/deploy-to-n8n creates and activates workflows in your n8n instance, and /api/templates reads from api.n8n.io. Data such as /api/conversations lives in Supabase Postgres (RLS: auth.uid), with Redis holding the recent-turns cache and the rate-limit counters.]

Every AI route follows the same pipeline: verify JWT → rate limit → build prompt → call Gemini with failover → repair/parse JSON → persist → respond.
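
To make the shape of that pipeline concrete, here is a minimal sketch of the steps composed into one handler. Everything in it is an assumption for illustration: `runAiRoute`, `PipelineDeps`, the step names, and the status codes are hypothetical and not taken from Velocity's code. The steps are injected so each one can be swapped or tested in isolation.

```typescript
// Hypothetical sketch: none of these names come from Velocity's source.
type Auth = { userId: string };
type RateResult = { ok: boolean; message?: string };

interface PipelineDeps<T> {
  verifyJwt: (authorizationHeader: string | null) => Promise<Auth | null>;
  checkRateLimit: (userId: string) => Promise<RateResult>;
  buildPrompt: (userInput: string) => string;
  callModel: (prompt: string) => Promise<string>; // model failover would live in here
  parseJson: (raw: string) => T | null; // a repair parser would live in here
  persist: (userId: string, result: T) => Promise<void>;
}

type PipelineResponse<T> =
  | { status: 200; body: T }
  | { status: 401 | 429 | 502; error: string };

// verify JWT -> rate limit -> build prompt -> call model -> parse -> persist -> respond
async function runAiRoute<T>(
  deps: PipelineDeps<T>,
  authorizationHeader: string | null,
  userInput: string,
): Promise<PipelineResponse<T>> {
  const auth = await deps.verifyJwt(authorizationHeader);
  if (!auth) return { status: 401, error: "Not signed in" };

  const rate = await deps.checkRateLimit(auth.userId);
  if (!rate.ok) return { status: 429, error: rate.message ?? "Rate limit reached" };

  const raw = await deps.callModel(deps.buildPrompt(userInput));
  const parsed = deps.parseJson(raw);
  if (parsed === null) return { status: 502, error: "Model returned unparseable JSON" };

  // Persist before responding so the stored record never lags the reply.
  await deps.persist(auth.userId, parsed);
  return { status: 200, body: parsed };
}
```

Each early return maps one failed step to one response, which keeps the order of checks visible in a single function.
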
Postgres is always the source of truth; Redis is always a cache that fails open.

The single hardest problem wasn't code. It was getting the model to emit workflow JSON that n8n will actually import. n8n is unforgiving: a wrong `typeVersion` loads a different parameter schema, a hallucinated dropdown value throws `Could not find property option`, and referencing a workflow that doesn't exist throws `Could not find workflow`.

The fix was treating the system prompt like a schema contract, not a vibe. A few of the rules that came directly from watching real imports fail:

```
IMPORT SAFETY: these prevent the two most common load errors:
- typeVersion: use 1 for simple core nodes unless you specifically need a
  newer version's fields. A wrong typeVersion loads a different parameter
  schema and breaks with "Could not find property option".
- Do NOT emit resource-locator objects ({"__rl":true,"mode":...,"value":...}).
  Their mode/value pairs are the #1 cause of "Could not find property option".
- NEVER reference another workflow you don't have a real id for.
- "connections" keys are the SOURCE node's NAME (exact, case-sensitive), never its id.
```

Each of those lines exists because a generated workflow broke without it.

The prompt also encodes verified parameter shapes for the most common nodes (HTTP Request, Set, If, Webhook, Schedule Trigger), because `"parameters": {}` produces workflows that import but do nothing:

```
- HTTP Request (POST): {"method":"POST","url":"...","authentication":"none",
  "sendBody":true,"body":{"contentType":"json","content":{...}}}
  GOTCHA: POST/PUT/PATCH MUST include "sendBody":true.
```

And the killer rule for accuracy: when a service has no dedicated n8n node, use a fully-configured HTTP Request node against its REST API instead of guessing. A configured `httpRequest` beats an empty branded stub every time.

Velocity has two AI surfaces that both emit n8n JSON: the conversational copilot (`/api/chat`) and the structured one-shot generator (`/api/generate-workflow`).
Early on they had separate prompts, and they drifted: a rule fixed in one route would still break in the other. The fix was boring and effective: the n8n contract lives in exactly one file, and both routes compose their system prompts from it.

```js
// n8nPrompt.ts: the single source of truth
export const N8N_WORKFLOW_SHAPE = `An n8n workflow is a JSON object that imports cleanly: {...}`;
export const N8N_RULES = `RULES (follow every one): ...`;

// Conversational copilot (used by /api/chat)
export const CHAT_SYSTEM = `You are Velocity, an AI copilot...
${N8N_WORKFLOW_SHAPE}
${N8N_RULES}
...`;

// Structured one-shot (used by /api/generate-workflow)
export const GENERATE_SYSTEM = `You are Velocity, an expert at building n8n workflows...
${N8N_WORKFLOW_SHAPE}
${N8N_RULES}`;
```

If you have two prompts encoding the same output schema, they will drift apart. Shared constants are the cheapest insurance you'll ever buy.

I run this on Gemini's free tier, and the preview model I use (`gemini-3-flash-preview`) has a 20-requests-per-day allowance. That's not a rate limit, it's a countdown timer. Rather than let the app die at request 21, the Gemini wrapper detects quota-exhausted 429s and transparently fails over to stable models that still have quota:

```js
export const GEMINI_MODEL = "gemini-3-flash-preview";
const FALLBACK_MODELS = ["gemini-2.5-flash", "gemini-2.0-flash"];

// Signals the primary/fallback loop to try the next model.
class QuotaExhaustedError extends Error {}

async function generate(contents, maxTokens, temperature, jsonMode = false) {
  const models = [GEMINI_MODEL, ...FALLBACK_MODELS];
  let lastQuota: QuotaExhaustedError | null = null;
  for (const model of models) {
    try {
      return await generateWithModel(model, contents, maxTokens, temperature, jsonMode);
    } catch (err) {
      // Only a spent quota triggers failover; every other error is terminal.
      if (err instanceof QuotaExhaustedError) {
        lastQuota = err;
        continue;
      }
      throw err;
    }
  }
  throw lastQuota ?? new Error(QUOTA_MESSAGE);
}
```

The subtle part is distinguishing the two kinds of 429. A regular rate-limit 429 recovers if you back off; a quota 429 will not clear until tomorrow, so retrying is pure waste. The tell is in the response body:

```
// A 429 whose body mentions quota/billing or "limit: 0" means the key's
// free-tier allowance is exhausted, so retrying won't help.
function isQuotaExhausted(body: string): boolean {
  return /\blimit:\s*0\b|exceeded your current quota|billing/i.test(body);
}
```

Transient errors (500/502/503/504) and recoverable 429s get exponential backoff with jitter, and the wrapper honors the API's own `Retry-After` header and Gemini's `RetryInfo` body when they suggest a delay. If the suggested delay is longer than the request budget, it gives up early with a human-readable message instead of hanging.

I also skipped the SDK entirely. The wrapper is a raw `fetch` against the Generative Language REST API: ~230 lines including all the retry/failover logic. When your error handling is your reliability story, owning the HTTP layer beats fighting an SDK's opinions.

Even with `responseMimeType: "application/json"` forced, LLM output is only almost JSON often enough to hurt: markdown fences around the object, a stray sentence before it, trailing commas, literal newlines inside string values, and (the worst one) output truncated mid-object at the token limit. A naive `JSON.parse(raw.slice(raw.indexOf("{"), raw.lastIndexOf("}") + 1))` fails on most of those.

So I wrote `parseModelJson`, a repair parser that extracts the first balanced JSON value with a proper scanner (string-aware, escape-aware), fixes control characters, and, if the input was truncated, closes the open string and any open brackets so the result still parses:

```
// Reached the end with brackets still open → truncated. Best-effort close.
if (inString) out += '"';
out = removeTrailingCommas(out.replace(/,\s*$/, ""));
while (stack.length) out += stack.pop();
return out;
```

Then it tries candidates from cleanest to rawest, each with a trailing-comma-stripped retry:

```js
for (const candidate of [extracted, stripped, raw]) {
  if (!candidate) continue;
  const direct = tryParse<T>(candidate);
  if (direct !== undefined) return direct;
  const repaired = tryParse<T>(removeTrailingCommas(candidate));
  if (repaired !== undefined) return repaired;
}
return null;
```

The same philosophy shows up in the chat flow: when a user asks to deploy "the workflow from our conversation," a depth-tracking scanner walks the assistant messages newest-first, pulls out every balanced `{...}` slice, and keeps the ones that actually contain a `nodes` array. Fenced, unfenced, buried in prose: it finds the workflow.

Rule of thumb: never let `JSON.parse` be the last line of defense between an LLM and your user.

Every API route authenticates the same way: the browser sends its Supabase JWT in the `Authorization` header, and the server builds a request-scoped Supabase client that carries that token:

```
export async function getAuth(req: Request): Promise<AuthContext | null> {
  const token = bearerToken(req);
  if (!token) return null;

  // Request-scoped client carrying the user's JWT, so RLS (auth.uid())
  // applies to every query made through it.
  const db = createClient(url, anon, {
    global: { headers: { Authorization: `Bearer ${token}` } },
    auth: { autoRefreshToken: false, persistSession: false },
  });

  const { data, error } = await db.auth.getUser(token);
  if (error || !data.user) return null;
  return { userId: data.user.id, db };
}
```

The important choice: routes query through this client, not through the service-role admin client. That means Postgres Row Level Security (`auth.uid()`) applies to every single query. Even if I write a buggy `WHERE` clause, user A physically cannot read user B's conversations: the database refuses.
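
The post calls a `bearerToken` helper without showing it. A minimal sketch of what such a helper might look like follows; the implementation and the `HasHeaders` type are assumptions, not Velocity's actual code. It accepts any object with a fetch-style `headers.get`, which a standard `Request` satisfies.

```typescript
// Hypothetical helper: the post references bearerToken(req) but does not show it.
type HasHeaders = { headers: { get(name: string): string | null } };

function bearerToken(req: HasHeaders): string | null {
  const header = req.headers.get("authorization");
  if (!header) return null;
  // HTTP auth scheme names are case-insensitive; the token must be non-empty
  // and contain no whitespace.
  const match = /^Bearer\s+(\S+)\s*$/i.exec(header);
  return match?.[1] ?? null;
}
```

Returning `null` for anything malformed lets the caller treat "no header", "wrong scheme", and "empty token" as the same unauthenticated case.
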
The service-role key is used in exactly one place: the signup route, where admin user provisioning genuinely needs it.

Chat needs the recent conversation history on every turn. Hitting Postgres for the last 20 messages on every message works, but Redis makes it instant, as long as you're disciplined about what Redis is:

```js
// Prior history: Redis cache, falling back to Postgres (and repriming).
let history = await getCachedTurns(conversationId);
if (history === null) {
  const { data: rows } = await auth.db
    .from("messages")
    .select("role, content")
    .eq("conversation_id", conversationId)
    .order("created_at", { ascending: false })
    .limit(MAX_TURNS);
  history = ((rows ?? []) as Turn[]).reverse();
  await primeCache(conversationId, history);
}
```

Both turns are persisted to Postgres before the cache is updated. A cache miss, an expired key, a Redis outage: all of it degrades to a Postgres read and a reprime. Nothing is ever lost because nothing important ever lived only in Redis.

The rate limiter follows the same philosophy, inverted. It fails open:

```
// Fails OPEN: if Redis is absent or errors, requests are allowed so chat
// keeps working. Rate limiting is a safety layer, never a hard dependency.
export async function checkRateLimit(scope: string): Promise<RateResult> {
  const redis = getRedis();
  if (!redis) return { ok: true };
  ...
}
```

It runs two layered counters: a global one protecting the shared Gemini key's quota across all users, and a per-user one so a single caller can't hog the whole budget. The defaults (5/min, 20/day) deliberately mirror the Gemini free-tier quota, so the app's rate limit and the upstream quota fail at the same boundary with a friendly message instead of a raw 429.

Generating JSON the user has to manually import is a demo. Deploying it is a product.
`/api/deploy-to-n8n` accepts either a plain-English prompt (generate first, then deploy) or existing workflow JSON, pushes it to the user's n8n instance via the public REST API, and tries to activate it:

```js
const created = await createWorkflowInN8n(workflowDraft, fallbackName);
const activation = await activateWorkflowInN8n(created.workflowId);
return NextResponse.json({
  workflowId: created.workflowId,
  workflowUrl: created.workflowUrl, // deep link into the n8n editor
  activated: activation.activated,
  activationError: activation.activated ? undefined : activation.detail,
});
```

Note the shape of the response: activation failure is not an error. A workflow with a manual trigger can't be activated. That's expected, so the route reports `activated: false` with the detail and still hands back the editor URL. Modeling "partial success" honestly beat forcing everything into success/failure.

The hardest bugs were import bugs, not code bugs. A workflow can be perfectly valid JSON and still fail n8n's import with `Could not find property option`. The causes were never in my code; they were in what the model invented: resource-locator objects, wrong `typeVersion`s, dropdown values that don't exist. The fix was moving n8n's failure modes into the prompt as explicit prohibitions. Every import error became a new rule. The prompt is a changelog of everything that ever broke.

"Add more validation code" is usually the wrong first move. My instinct on every bad output was to write a post-processor. But post-processors can't fix a hallucinated parameter schema; they can only detect it. Restructuring the prompt (verified parameter shapes, "use httpRequest when unsure") fixed at the source what validation could only reject.

Free-tier quotas are an architecture constraint, not an ops detail. 20 requests/day on the primary model shaped real design decisions: the failover chain, the two-layer rate limiter mirroring the quota, distinguishing quota-429s from rate-429s, and honoring `Retry-After`.
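
As an illustration of the retry policy described earlier (jittered exponential backoff, honoring `Retry-After`, and giving up when the wait exceeds the request budget), here is a minimal sketch. All names (`parseRetryAfterMs`, `backoffMs`, `nextDelayMs`) and the 500 ms base and 8 s cap are assumptions, full jitter is just one common variant, and Gemini's `RetryInfo` body is not handled here.

```typescript
// Hypothetical sketch of a retry-delay policy; not Velocity's actual wrapper.

// Retry-After is either a number of seconds or an HTTP date.
function parseRetryAfterMs(value: string | null, nowMs: number): number | null {
  if (!value) return null;
  const trimmed = value.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const dateMs = Date.parse(trimmed);
  return Number.isNaN(dateMs) ? null : Math.max(0, dateMs - nowMs);
}

// Exponential backoff with full jitter. `random` is injectable for testing.
function backoffMs(
  attempt: number,
  baseMs: number,
  capMs: number,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

interface DelayInput {
  attempt: number; // 0 for the first retry
  retryAfter: string | null; // raw Retry-After header value, if any
  remainingBudgetMs: number; // time left before the request must answer
  nowMs: number;
  random?: () => number;
}

// Returns the delay to wait, or null when waiting would blow the budget,
// in which case the caller should fail fast with a readable message.
function nextDelayMs(input: DelayInput): number | null {
  const suggested = parseRetryAfterMs(input.retryAfter, input.nowMs);
  const delay = suggested ?? backoffMs(input.attempt, 500, 8000, input.random);
  return delay > input.remainingBudgetMs ? null : delay;
}
```

The server's suggestion takes priority over the local backoff because the server knows when capacity returns; the budget check is what turns a long suggested wait into an early, explicit failure.
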
If I'd treated the quota as "someone else's problem," the app would be a coin flip.

LLMs leak their reasoning into output, and you have to tell them not to. Models love to open with "Okay, the user wants a workflow that..." before the JSON. Two mitigations: a `DIRECTIVE` suffix appended to every final user turn ("Output ONLY the final answer... Begin the answer immediately"), and hard constraints in the chat prompt banning meta-commentary, node-by-node prose dumps, and "here is the workflow" framing. Prompt discipline is output discipline.

Every external dependency needs a "not configured" story. Supabase clients return `null` when env vars are missing. `hasGemini` gates every AI route. `hasN8nConfig` gates deployment. Every route degrades to a clear "add X to .env.local" message instead of a crash. It made local development, demos, and partial deployments painless: you can run the marketing page with zero env vars.

Next.js 16 is not the Next.js in the model's training data. The repo has a standing rule: read the docs shipped in `node_modules/next/dist/docs/` before writing framework code, because App Router conventions moved again. When a framework error looks undocumented, check your framework version first; the answer is almost always a recent breaking change.

Velocity's core loop (describe → generate → deploy) works end-to-end. What I'm building next: `/api/analyze-workflow` and `/api/explain-error` routes are the foundation.

The core insight hasn't changed: automation knowledge shouldn't be locked behind a node catalog and a canvas. You describe the outcome; the machine should handle the wiring.

Built with Next.js 16, React 19, Google Gemini, Supabase, Redis, Tailwind CSS v4, and the n8n REST API.

If you've fought with LLM structured output, n8n imports, or free-tier quotas, I'd genuinely love to hear how you handled it. Drop a comment. 👇

Arish Singh