# How I Built Velocity — An AI Platform That Turns Plain English Into Production-Grade n8n Workflows

> Source: <https://dev.to/arishsingh99/how-i-built-velocity-an-ai-platform-that-turns-plain-english-into-production-grade-n8n-workflows-4p4d>
> Published: 2026-07-04 01:01:24+00:00

You describe the automation. Velocity ships the workflow — importable, deployable, running in your n8n instance one click later.

Every automation builder has been there. You know exactly what you want: "when a form is submitted, enrich the lead, drop it in a sheet, and ping Slack." Then you spend the next two hours in the n8n canvas hunting for the right node, guessing parameter names, wiring connections, and debugging `Could not find property option` errors on import.

The knowledge to build that workflow exists. It's just locked behind hundreds of node types, `typeVersion` quirks, and expression syntax. So I built **Velocity**, an AI platform that takes a plain-English description and generates a complete, valid, *importable* n8n workflow JSON. Not a 4-node demo. A production-grade pipeline with real triggers, configured parameters, error notifications, and sticky-note documentation: the depth of an actual published n8n template.

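Velocity is a full-stack AI automation copilot. The core loop: describe → generate → deploy.

**Tech stack:** Next.js 16, React 19, Google Gemini, Supabase, Redis, Tailwind CSS v4, and the n8n REST API. The pieces fit together like this: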

```
                        ┌─────────────────────────────┐
                        │        Next.js 16 App        │
                        │                             │
 Browser ──────────────▶│  /api/chat ─────────┐       │
   │                    │  /api/generate-workflow ─┐  │      ┌──────────────┐
   │  Supabase JWT      │  /api/explain-error ──┐  │  │      │    Gemini    │
   │  (Authorization    │  /api/analyze-workflow│  │  ├─────▶│  REST API    │
   │   header)          │  /api/plan-workflow ──┘  │  │      │ + fallback   │
   │                    │                          │  │      │   models     │
   │                    │  /api/deploy-to-n8n ─────┼──┼──┐   └──────────────┘
   │                    │  /api/templates ─────────┼──┼──┼─▶ api.n8n.io
   │                    │  /api/conversations ──┐  │  │  │
   │                    └───────────────────────┼──┼──┼──┼──┘
   │                                            │  │  │  │
   │                          ┌─────────────────▼──▼──┤  └─▶ Your n8n
   │                          │  Supabase Postgres    │      instance
   └─────────────────────────▶│  (RLS: auth.uid())    │      (create +
                              └───────────┬───────────┘       activate)
                                          │
                              ┌───────────▼───────────┐
                              │  Redis                │
                              │  · recent-turns cache │
                              │  · rate-limit counters│
                              └───────────────────────┘
```

Every AI route follows the same pipeline: **verify JWT → rate limit → build prompt → call Gemini (with failover) → repair/parse JSON → persist → respond.** Postgres is always the source of truth; Redis is always a cache that fails open.

The single hardest problem wasn't code; it was getting the model to emit workflow JSON that n8n will actually *import*. n8n is unforgiving: a wrong `typeVersion` loads a different parameter schema, a hallucinated dropdown value throws `Could not find property option`, and referencing a workflow that doesn't exist throws `Could not find workflow`.

The fix was treating the system prompt like a schema contract, not a vibe. A few of the rules that came directly from watching real imports fail:

```
IMPORT SAFETY — these prevent the two most common load errors:
- typeVersion: use 1 for simple core nodes unless you specifically need a
  newer version's fields. A wrong typeVersion loads a different parameter
  schema and breaks with "Could not find property option".
- Do NOT emit resource-locator objects ({"__rl":true,"mode":...,"value":...}).
  Their mode/value pairs are the #1 cause of "Could not find property option".
- NEVER reference another workflow you don't have a real id for.
- "connections" keys are the SOURCE node's NAME (exact, case-sensitive) —
  never its id.
```

Each of those lines exists because a generated workflow broke without it. The prompt also encodes verified parameter shapes for the most common nodes (HTTP Request, Set, If, Webhook, Schedule Trigger), because `"parameters": {}` produces workflows that import but do nothing:

```
- HTTP Request (POST): {"method":"POST","url":"...","authentication":"none",
  "sendBody":true,"body":{"contentType":"json","content":{...}}}
  GOTCHA: POST/PUT/PATCH MUST include "sendBody":true.
```

And the killer rule for accuracy: **when a service has no dedicated n8n node, use a fully-configured HTTP Request node against its REST API instead of guessing.** A configured `httpRequest` beats an empty branded stub every time.
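The `connections` rule from the import-safety list is easy to check mechanically. Here is a minimal sketch, assuming the usual n8n export layout where `connections` is keyed by the source node's name and each target entry carries a `node` name. The interfaces and `findDanglingConnectionNames` are illustrative names, not Velocity's code.

```typescript
// Minimal structural types for the parts of a workflow this check touches.
interface N8nNode {
  id: string;
  name: string;
  type: string;
  typeVersion: number;
  position: [number, number];
  parameters: Record<string, unknown>;
}

interface N8nConnectionTarget {
  node: string; // the TARGET node's name, never its id
  type: string;
  index: number;
}

interface N8nWorkflow {
  name: string;
  nodes: N8nNode[];
  // Keyed by the SOURCE node's name; each output type holds a list of branches.
  connections: Record<string, Record<string, (N8nConnectionTarget[] | null)[]>>;
}

// Returns every name used in "connections" that matches no node name.
function findDanglingConnectionNames(workflow: N8nWorkflow): string[] {
  const names = new Set(workflow.nodes.map((n) => n.name));
  const dangling = new Set<string>();
  for (const [source, outputs] of Object.entries(workflow.connections)) {
    if (!names.has(source)) dangling.add(source);
    for (const branches of Object.values(outputs)) {
      for (const branch of branches) {
        for (const target of branch ?? []) {
          if (!names.has(target.node)) dangling.add(target.node);
        }
      }
    }
  }
  return Array.from(dangling);
}

// A two-node example wired by name.
const exampleWorkflow: N8nWorkflow = {
  name: "Lead intake",
  nodes: [
    { id: "a1", name: "Webhook", type: "n8n-nodes-base.webhook", typeVersion: 1,
      position: [0, 0], parameters: { path: "lead", httpMethod: "POST" } },
    { id: "b2", name: "Done", type: "n8n-nodes-base.noOp", typeVersion: 1,
      position: [220, 0], parameters: {} },
  ],
  connections: {
    Webhook: { main: [[{ node: "Done", type: "main", index: 0 }]] },
  },
};
```

An empty result means every connection resolves to a real node name. A workflow that keys `connections` by `a1` instead of `Webhook` would report `a1` as dangling.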

Velocity has two AI surfaces that both emit n8n JSON: the conversational copilot (`/api/chat`) and the structured one-shot generator (`/api/generate-workflow`). Early on they had separate prompts, and they drifted: a rule fixed in one route would still break in the other.

The fix was boring and effective: the n8n contract lives in exactly one file, and both routes compose their system prompts from it.

``` js
// n8nPrompt.ts — the single source of truth
export const N8N_WORKFLOW_SHAPE = `An n8n workflow is a JSON object that imports cleanly: {...}`;
export const N8N_RULES = `RULES — follow every one: ...`;

// Conversational copilot (used by /api/chat)
export const CHAT_SYSTEM = `You are Velocity, an AI copilot...
${N8N_WORKFLOW_SHAPE}
${N8N_RULES}
...`;

// Structured one-shot (used by /api/generate-workflow)
export const GENERATE_SYSTEM = `You are Velocity, an expert at building n8n workflows...
${N8N_WORKFLOW_SHAPE}
${N8N_RULES}`;
```

If you have two prompts encoding the same output schema, they *will* drift apart. Shared constants are the cheapest insurance you'll ever buy.

I run this on Gemini's free tier, and the preview model I use (`gemini-3-flash-preview`) has a **20 requests per day** allowance. That's not a rate limit, it's a countdown timer. Rather than let the app die at request 21, the Gemini wrapper detects quota-exhausted 429s and transparently fails over to stable models that still have quota:

``` ts
export const GEMINI_MODEL = "gemini-3-flash-preview";
const FALLBACK_MODELS = ["gemini-2.5-flash", "gemini-2.0-flash"];

// Signals the primary/fallback loop to try the next model.
class QuotaExhaustedError extends Error {}

async function generate(contents, maxTokens, temperature, jsonMode = false) {
  const models = [GEMINI_MODEL, ...FALLBACK_MODELS];
  let lastQuota: QuotaExhaustedError | null = null;

  for (const model of models) {
    try {
      return await generateWithModel(model, contents, maxTokens, temperature, jsonMode);
    } catch (err) {
      // Only a spent quota triggers failover; every other error is terminal.
      if (err instanceof QuotaExhaustedError) { lastQuota = err; continue; }
      throw err;
    }
  }
  throw lastQuota ?? new Error(QUOTA_MESSAGE);
}
```

The subtle part is *distinguishing* the two kinds of 429. A regular rate-limit 429 recovers if you back off; a quota 429 will not clear until tomorrow, so retrying is pure waste. The tell is in the response body:

``` ts
// A 429 whose body mentions quota/billing (or "limit: 0") means the key's
// free-tier allowance is exhausted — retrying won't help.
function isQuotaExhausted(body: string): boolean {
  return /\blimit:\s*0\b|exceeded your current quota|billing/i.test(body);
}
```

Transient errors (500/502/503/504 and recoverable 429s) get exponential backoff with jitter, and the wrapper honors the API's own `Retry-After` header and Gemini's `RetryInfo` body when they suggest a delay. If the suggested delay is longer than the request budget, it gives up early with a human-readable message instead of hanging.
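The delay decision described above can be sketched as three small functions. This is an illustration under stated assumptions, not the wrapper's actual code: the function names, the "full jitter" variant, and the base/cap values are all hypothetical, and parsing Gemini's `RetryInfo` body is left out.

```typescript
// Parses a Retry-After header, which is either delay-seconds or an HTTP-date.
function parseRetryAfterMs(header: string | null, nowMs: number): number | null {
  if (header === null) return null;
  const trimmed = header.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const dateMs = Date.parse(trimmed);
  return Number.isNaN(dateMs) ? null : Math.max(0, dateMs - nowMs);
}

// "Full jitter" backoff: a random delay in [0, min(cap, base * 2^attempt)).
// `rand` is injectable so the function can be tested deterministically.
function backoffMs(
  attempt: number,
  baseMs: number,
  capMs: number,
  rand: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(rand() * ceiling);
}

interface DelayOptions {
  attempt: number;           // 0 for the first retry
  retryAfter: string | null; // raw Retry-After header, if any
  nowMs: number;
  remainingBudgetMs: number; // time left before the request must answer
  rand?: () => number;
}

// Prefers the server's suggested delay over the computed backoff.
// Returns null when waiting would blow the remaining request budget.
function nextDelayMs(opts: DelayOptions): number | null {
  const suggested = parseRetryAfterMs(opts.retryAfter, opts.nowMs);
  const delay = suggested ?? backoffMs(opts.attempt, 500, 8000, opts.rand);
  return delay > opts.remainingBudgetMs ? null : delay;
}
```

A `null` from `nextDelayMs` is the signal to stop retrying and return the human-readable message rather than sleep past the budget.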

I also skipped the SDK entirely. The wrapper is a raw `fetch` against the Generative Language REST API, about 230 lines including all the retry/failover logic. When your error handling *is* your reliability story, owning the HTTP layer beats fighting an SDK's opinions.

Even with `responseMimeType: "application/json"` forced, LLM output is only *almost* JSON often enough to hurt: markdown fences around the object, a stray sentence before it, trailing commas, literal newlines inside string values, and (the worst one) output truncated mid-object at the token limit.

A naive `JSON.parse(raw.slice(raw.indexOf("{"), raw.lastIndexOf("}") + 1))` survives the fences and the leading sentence, but it still throws on trailing commas, literal newlines, and truncated output, and it grabs the wrong span when the surrounding prose contains braces of its own. So I wrote `parseModelJson`, a repair parser that extracts the first *balanced* JSON value with a proper scanner (string-aware, escape-aware), fixes control characters, and, if the input was truncated, closes the open string and any open brackets so the result still parses:

```
// Reached the end with brackets still open → truncated. Best-effort close.
if (inString) out += '"';
out = removeTrailingCommas(out.replace(/,\s*$/, ""));
while (stack.length) out += stack.pop();
return out;
```

Then it tries candidates from cleanest to rawest, each with a trailing-comma-stripped retry:

``` ts
for (const candidate of [extracted, stripped, raw]) {
  if (!candidate) continue;
  const direct = tryParse<T>(candidate);
  if (direct !== undefined) return direct;
  const repaired = tryParse<T>(removeTrailingCommas(candidate));
  if (repaired !== undefined) return repaired;
}
return null;
```
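For readers who want to see the scanning step as a whole, here is a simplified sketch of a string-aware, escape-aware extractor with best-effort truncation repair. It is my own reconstruction of the idea, not the real `parseModelJson`: `extractFirstJson` is a hypothetical name, and it skips the trailing-comma pass inside the value.

```typescript
// Returns the first balanced JSON object/array found in `raw`. If the input
// ends while brackets are still open, closes them on a best-effort basis.
function extractFirstJson(raw: string): string | null {
  const start = raw.search(/[{[]/);
  if (start === -1) return null;

  const stack: string[] = []; // pending closers, innermost last
  let inString = false;
  let escaped = false;
  let out = "";

  for (let i = start; i < raw.length; i++) {
    const ch = raw.charAt(i);
    if (inString) {
      if (escaped) { escaped = false; out += ch; }
      else if (ch === "\\") { escaped = true; out += ch; }
      else if (ch === '"') { inString = false; out += ch; }
      else if (ch === "\n") out += "\\n"; // literal newline inside a string
      else if (ch === "\r") out += "\\r";
      else if (ch === "\t") out += "\\t";
      else out += ch;
      continue;
    }
    out += ch;
    if (ch === '"') inString = true;
    else if (ch === "{") stack.push("}");
    else if (ch === "[") stack.push("]");
    else if (ch === "}" || ch === "]") {
      stack.pop();
      if (stack.length === 0) return out; // balanced value complete
    }
  }

  // Ran off the end with brackets open: the output was truncated.
  if (escaped) out = out.slice(0, -1);  // drop a dangling backslash
  if (inString) out += '"';             // close the open string
  out = out.replace(/,\s*$/, "");       // drop a dangling comma
  while (stack.length > 0) out += stack.pop() ?? "";
  return out;
}
```

Truncation repair is only best-effort: input cut off in the middle of a key (`{"a": 1, "b`) closes into something that still does not parse, which is why a candidate loop with a `null` fallback is still needed around it.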

The same philosophy shows up in the chat flow: when a user asks to deploy "the workflow from our conversation," a depth-tracking scanner walks the assistant messages newest-first, pulls out every balanced `{...}` slice, and keeps the ones that actually contain a `nodes` array. Fenced, unfenced, buried in prose: it finds the workflow.
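A rough sketch of that lookup, assuming messages are plain `{ role, content }` objects. `balancedObjectSlices` and `findLatestWorkflow` are hypothetical names, and this version returns only the newest match rather than every candidate.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Collects every top-level balanced {...} slice in `text`. Quotes are only
// tracked inside an object, so quotation marks in prose do not confuse it.
function balancedObjectSlices(text: string): string[] {
  const slices: string[] = [];
  let depth = 0;
  let start = -1;
  let inString = false;
  let escaped = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"' && depth > 0) inString = true;
    else if (ch === "{") {
      if (depth === 0) start = i;
      depth++;
    } else if (ch === "}" && depth > 0) {
      depth--;
      if (depth === 0) slices.push(text.slice(start, i + 1));
    }
  }
  return slices;
}

// Walks assistant messages newest-first and returns the first parsed object
// that has a "nodes" array, or null when none is found.
function findLatestWorkflow(messages: ChatMessage[]): Record<string, unknown> | null {
  for (let i = messages.length - 1; i >= 0; i--) {
    const msg = messages[i];
    if (!msg || msg.role !== "assistant") continue;
    for (const slice of balancedObjectSlices(msg.content)) {
      try {
        const parsed: unknown = JSON.parse(slice);
        if (
          typeof parsed === "object" && parsed !== null &&
          Array.isArray((parsed as { nodes?: unknown }).nodes)
        ) {
          return parsed as Record<string, unknown>;
        }
      } catch {
        // Not valid JSON (for example "{curly}" in prose); keep scanning.
      }
    }
  }
  return null;
}
```

Slices that fail `JSON.parse` are simply skipped, so braces in ordinary prose cost nothing beyond a failed parse.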

**Rule of thumb: never let JSON.parse be the last line of defense between an LLM and your user.**

Every API route authenticates the same way: the browser sends its Supabase JWT in the `Authorization` header, and the server builds a **request-scoped** Supabase client that carries that token:

``` ts
export async function getAuth(req: Request): Promise<AuthContext | null> {
  const token = bearerToken(req);
  if (!token) return null;

  // Request-scoped client carrying the user's JWT, so RLS (auth.uid())
  // applies to every query made through it.
  const db = createClient(url, anon, {
    global: { headers: { Authorization: `Bearer ${token}` } },
    auth: { autoRefreshToken: false, persistSession: false },
  });

  const { data, error } = await db.auth.getUser(token);
  if (error || !data.user) return null;
  return { userId: data.user.id, db };
}
```

The important choice: routes query through this client, **not** through the service-role admin client. That means Postgres Row Level Security (`auth.uid()`) applies to every single query. Even if I write a buggy `WHERE` clause, user A physically cannot read user B's conversations because the database refuses. The service-role key is used in exactly one place: the signup route, where admin user provisioning genuinely needs it.
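`getAuth` leans on a `bearerToken` helper that the post does not show. A minimal sketch of what such a helper could look like follows; the structural `HasHeaders` type is my own stand-in so the snippet does not depend on a global `Request` type, and a real `Request` satisfies it.

```typescript
// Anything with a fetch-style headers.get(); a standard Request qualifies.
type HasHeaders = { headers: { get(name: string): string | null } };

// Extracts the token from "Authorization: Bearer <token>", or returns null
// when the header is missing, uses another scheme, or is malformed.
function bearerToken(req: HasHeaders): string | null {
  const header = req.headers.get("authorization");
  if (!header) return null;
  const match = /^Bearer\s+(\S+)$/i.exec(header.trim());
  return match?.[1] ?? null;
}
```

Returning `null` for anything malformed keeps the caller's contract simple: no token means no auth context.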

Chat needs the recent conversation history on every turn. Hitting Postgres for the last 20 messages on every message works, but Redis makes it instant — as long as you're disciplined about what Redis *is*:

``` ts
// Prior history: Redis cache, falling back to Postgres (and repriming).
let history = await getCachedTurns(conversationId);
if (history === null) {
  const { data: rows } = await auth.db
    .from("messages")
    .select("role, content")
    .eq("conversation_id", conversationId)
    .order("created_at", { ascending: false })
    .limit(MAX_TURNS);
  history = ((rows ?? []) as Turn[]).reverse();
  await primeCache(conversationId, history);
}
```

Both turns are persisted to Postgres **before** the cache is updated. A cache miss, an expired key, a Redis outage — all of it degrades to a Postgres read and a reprime. Nothing is ever lost because nothing important ever lived only in Redis.
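The cache helpers themselves can stay tiny if every failure is treated as a miss. A sketch under stated assumptions: the `KeyValueCache` interface, the key format, and the TTL are hypothetical, and the cache client is passed in explicitly here (unlike the `getCachedTurns(conversationId)` call above) so the snippet is self-contained.

```typescript
type Turn = { role: string; content: string };

// Hypothetical minimal client surface; a real Redis client exposes far more.
interface KeyValueCache {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

const TURNS_TTL_SECONDS = 3600; // assumed TTL, not from the original post
const turnsKey = (conversationId: string): string => `turns:${conversationId}`;

// Returns cached turns, or null on a miss, an outage, or a corrupt entry.
async function getCachedTurns(
  cache: KeyValueCache | null,
  conversationId: string,
): Promise<Turn[] | null> {
  if (!cache) return null;
  try {
    const raw = await cache.get(turnsKey(conversationId));
    if (raw === null) return null;
    const parsed: unknown = JSON.parse(raw);
    return Array.isArray(parsed) ? (parsed as Turn[]) : null;
  } catch {
    return null; // the caller falls back to Postgres
  }
}

// Best-effort write: a failed reprime must never fail the request.
async function primeCache(
  cache: KeyValueCache | null,
  conversationId: string,
  turns: Turn[],
): Promise<void> {
  if (!cache) return;
  try {
    await cache.set(turnsKey(conversationId), JSON.stringify(turns), TURNS_TTL_SECONDS);
  } catch {
    // Ignore: Postgres already holds the data.
  }
}
```

Because both helpers swallow errors, the route code only ever has to distinguish "got turns" from "got `null`".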

The rate limiter follows the same philosophy but inverted — it **fails open**:

``` ts
// Fails OPEN: if Redis is absent or errors, requests are allowed so chat
// keeps working — rate limiting is a safety layer, never a hard dependency.
export async function checkRateLimit(scope: string): Promise<RateResult> {
  const redis = getRedis();
  if (!redis) return { ok: true };
  ...
}
```

It runs two layered counters — a **global** one protecting the shared Gemini key's quota across all users, and a **per-user** one so a single caller can't hog the whole budget. The defaults (5/min, 20/day) deliberately mirror the Gemini free-tier quota, so the app's rate limit and the upstream quota fail at the same boundary with a friendly message instead of a raw 429.
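One way to picture the two layers is a fixed-window counter per scope and window. This is a sketch, not the production limiter: `CounterStore`, the key format, the richer `RateResult`, and the extra parameters on `checkRateLimit` are all assumptions made so the example runs on its own.

```typescript
// Hypothetical store: increments a counter, sets its expiry on first use,
// and resolves with the new value (INCR + EXPIRE in Redis terms).
interface CounterStore {
  incr(key: string, ttlSeconds: number): Promise<number>;
}

type Scope = "global" | "user";
type Window = "minute" | "day";
type RateResult = { ok: true } | { ok: false; scope: Scope; window: Window };

interface Limits {
  global: { perMinute: number; perDay: number };
  user: { perMinute: number; perDay: number };
}

const WINDOW_MS: Record<Window, number> = { minute: 60000, day: 86400000 };

async function checkRateLimit(
  store: CounterStore | null,
  userId: string,
  limits: Limits,
  nowMs: number,
): Promise<RateResult> {
  if (!store) return { ok: true }; // not configured: fail open
  const checks: { scope: Scope; id: string; window: Window; max: number }[] = [
    { scope: "global", id: "all", window: "minute", max: limits.global.perMinute },
    { scope: "global", id: "all", window: "day", max: limits.global.perDay },
    { scope: "user", id: userId, window: "minute", max: limits.user.perMinute },
    { scope: "user", id: userId, window: "day", max: limits.user.perDay },
  ];
  try {
    for (const c of checks) {
      const ms = WINDOW_MS[c.window];
      const bucket = Math.floor(nowMs / ms); // fixed-window bucket index
      const key = `rl:${c.scope}:${c.id}:${c.window}:${bucket}`;
      const count = await store.incr(key, ms / 1000);
      if (count > c.max) return { ok: false, scope: c.scope, window: c.window };
    }
    return { ok: true };
  } catch {
    return { ok: true }; // store error: fail open
  }
}
```

Note the trade-off in this simple version: rejected requests still increment the counters, and fixed windows allow a short burst across a window boundary. Both are usually acceptable for a safety layer.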

Generating JSON the user has to manually import is a demo. Deploying it is a product. `/api/deploy-to-n8n` accepts either a plain-English prompt (generate first, then deploy) or existing workflow JSON, pushes it to the user's n8n instance via the public REST API, and tries to activate it:

``` js
const created = await createWorkflowInN8n(workflowDraft, fallbackName);
const activation = await activateWorkflowInN8n(created.workflowId);

return NextResponse.json({
  workflowId: created.workflowId,
  workflowUrl: created.workflowUrl,   // deep link into the n8n editor
  activated: activation.activated,
  activationError: activation.activated ? undefined : activation.detail,
});
```

Note the shape of the response: activation failure is **not** an error. A workflow with a manual trigger can't be activated, and that's expected, so the route reports `activated: false` with the detail and still hands back the editor URL. Modeling "partial success" honestly beat forcing everything into success/failure.
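To make the "activation never throws" contract concrete, here is a hedged sketch of an activation helper. It assumes n8n's public API shape (`POST /api/v1/workflows/{id}/activate` with an `X-N8N-API-KEY` header) and assumes error bodies carry a `message` field. The helper names and the injected `fetchFn` parameter are mine, so the signature differs from the one-argument call shown above.

```typescript
interface ActivationResult {
  activated: boolean;
  detail?: string;
}

// Builds the activate request against the n8n public API.
function buildActivateRequest(baseUrl: string, apiKey: string, workflowId: string) {
  const root = baseUrl.replace(/\/+$/, ""); // tolerate trailing slashes
  return {
    url: `${root}/api/v1/workflows/${encodeURIComponent(workflowId)}/activate`,
    init: {
      method: "POST",
      headers: { "X-N8N-API-KEY": apiKey, Accept: "application/json" } as Record<string, string>,
    },
  };
}

// Maps an HTTP outcome to a result object instead of throwing.
function interpretActivation(status: number, bodyText: string): ActivationResult {
  if (status >= 200 && status < 300) return { activated: true };
  let detail = bodyText;
  try {
    const parsed: unknown = JSON.parse(bodyText);
    const message = (parsed as { message?: unknown } | null)?.message;
    if (typeof message === "string") detail = message; // assumed error shape
  } catch {
    // Body was not JSON; keep the raw text.
  }
  return { activated: false, detail: detail || `n8n responded with HTTP ${status}` };
}

// Minimal fetch-like signature so the sketch does not depend on global types.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string> },
) => Promise<{ status: number; text(): Promise<string> }>;

async function activateWorkflowInN8n(
  fetchFn: FetchLike,
  baseUrl: string,
  apiKey: string,
  workflowId: string,
): Promise<ActivationResult> {
  try {
    const { url, init } = buildActivateRequest(baseUrl, apiKey, workflowId);
    const res = await fetchFn(url, init);
    return interpretActivation(res.status, await res.text());
  } catch (err) {
    // Network failures are reported the same way as a refused activation.
    return { activated: false, detail: err instanceof Error ? err.message : String(err) };
  }
}
```

Keeping request building and response interpretation as pure functions makes the partial-success mapping testable without a running n8n instance.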

**The hardest bugs were import bugs, not code bugs.** A workflow can be perfectly valid JSON and still fail n8n's import with `Could not find property option`. The causes were never in my code; they were in what the model *invented*: resource-locator objects, wrong `typeVersion`s, dropdown values that don't exist. The fix was moving n8n's failure modes into the prompt as explicit prohibitions. Every import error became a new rule. The prompt is a changelog of everything that ever broke.

**"Add more validation code" is usually the wrong first move.** My instinct on every bad output was to write a post-processor. But post-processors can't fix a hallucinated parameter schema — they can only detect it. Restructuring the prompt (verified parameter shapes, "use httpRequest when unsure") fixed at the source what validation could only reject.
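Detection still has a place as a tripwire, as long as nobody expects it to repair anything. As an illustration of a detect-only check (my own sketch, not part of Velocity), the function below flags two symptoms mentioned earlier: nodes with empty `parameters`, and POST/PUT/PATCH HTTP Request nodes missing `"sendBody": true`. The allowlist is an assumption, since a few node types legitimately take no parameters.

```typescript
interface NodeLike {
  name: string;
  type: string;
  parameters?: Record<string, unknown>;
}

const BODY_METHODS = new Set(["POST", "PUT", "PATCH"]);

// Node types that are fine with empty parameters (assumed, not exhaustive).
const EMPTY_PARAMS_OK = new Set(["n8n-nodes-base.manualTrigger", "n8n-nodes-base.noOp"]);

// Reports suspicious nodes; it never rewrites them.
function lintNodes(nodes: NodeLike[]): string[] {
  const problems: string[] = [];
  for (const node of nodes) {
    const params = node.parameters ?? {};
    if (Object.keys(params).length === 0) {
      if (!EMPTY_PARAMS_OK.has(node.type)) problems.push(`${node.name}: empty parameters`);
      continue;
    }
    if (node.type === "n8n-nodes-base.httpRequest") {
      const method = String(params["method"] ?? "GET").toUpperCase();
      if (BODY_METHODS.has(method) && params["sendBody"] !== true) {
        problems.push(`${node.name}: ${method} without "sendBody": true`);
      }
    }
  }
  return problems;
}
```

A non-empty result is a reason to regenerate or to add a prompt rule. It cannot tell you what the correct parameters should have been.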

**Free-tier quotas are an architecture constraint, not an ops detail.** 20 requests/day on the primary model shaped real design decisions: the failover chain, the two-layer rate limiter mirroring the quota, distinguishing quota-429s from rate-429s, and honoring `Retry-After`. If I'd treated the quota as "someone else's problem," the app would be a coin flip.

**LLMs leak their reasoning into output, and you have to tell them not to.** Models love to open with "Okay, the user wants a workflow that..." before the JSON. Two mitigations: a `DIRECTIVE` suffix appended to every final user turn ("Output ONLY the final answer... Begin the answer immediately"), and hard constraints in the chat prompt banning meta-commentary, node-by-node prose dumps, and "here is the workflow" framing. Prompt discipline is output discipline.
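Appending the suffix is a one-liner in spirit. A small sketch, assuming turns are `{ role, content }` objects; `withDirective` is a hypothetical name, and the `DIRECTIVE` text below is a shortened placeholder rather than the full wording used in Velocity.

```typescript
type Turn = { role: "user" | "assistant"; content: string };

// Placeholder wording; the real directive is longer.
const DIRECTIVE = "Output ONLY the final answer. Begin the answer immediately.";

// Returns a copy of `turns` with the directive appended to the LAST user
// turn only, so stored history is never mutated or polluted.
function withDirective(turns: Turn[], directive: string = DIRECTIVE): Turn[] {
  let lastUser = -1;
  for (let i = turns.length - 1; i >= 0; i--) {
    if (turns[i]?.role === "user") {
      lastUser = i;
      break;
    }
  }
  if (lastUser === -1) return turns;
  return turns.map((t, i) =>
    i === lastUser ? { ...t, content: `${t.content}\n\n${directive}` } : t,
  );
}
```

Applying it at request-build time, on a copy, means the directive reaches the model on every call without ever being persisted into the conversation.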

**Every external dependency needs a "not configured" story.** Supabase clients return `null` when env vars are missing. `hasGemini()` gates every AI route. `hasN8nConfig()` gates deployment. Every route degrades to a clear "add X to .env.local" message instead of a crash. It made local development, demos, and partial deployments painless: you can run the marketing page with zero env vars.
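The gates can be as simple as a presence check on environment variables. In this sketch the variable names (`GEMINI_API_KEY`, `N8N_BASE_URL`, `N8N_API_KEY`), the explicit `env` parameter, and the 503 status are all assumptions for illustration. The real helpers take no arguments and may read different names.

```typescript
type Env = Record<string, string | undefined>;

// True only when every key is present and non-blank.
function hasAll(env: Env, keys: string[]): boolean {
  return keys.every((key) => (env[key] ?? "").trim() !== "");
}

// Hypothetical variable names, chosen for the example.
const hasGemini = (env: Env): boolean => hasAll(env, ["GEMINI_API_KEY"]);
const hasN8nConfig = (env: Env): boolean => hasAll(env, ["N8N_BASE_URL", "N8N_API_KEY"]);

// A uniform "not configured" payload a route can return instead of crashing.
function notConfigured(what: string): { status: number; body: { error: string } } {
  return {
    status: 503,
    body: { error: `${what} is not configured. Add it to .env.local.` },
  };
}
```

In a route this reads as an early return, for example `if (!hasGemini(process.env)) return notConfigured("GEMINI_API_KEY")`, translated into whatever response type the framework expects.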

**Next.js 16 is not the Next.js in the model's training data.** The repo has a standing rule: read the docs shipped in `node_modules/next/dist/docs/` before writing framework code, because App Router conventions moved again. When a framework error looks undocumented, check your framework version first; the answer is almost always a recent breaking change.

Velocity's core loop (describe → generate → deploy) works end-to-end. For what I'm building next, the existing `/api/analyze-workflow` and `/api/explain-error` routes are the foundation.

The core insight hasn't changed: automation knowledge shouldn't be locked behind a node catalog and a canvas. You describe the outcome; the machine should handle the wiring.

*Built with Next.js 16, React 19, Google Gemini, Supabase, Redis, Tailwind CSS v4, and the n8n REST API.*

If you've fought with LLM structured output, n8n imports, or free-tier quotas — I'd genuinely love to hear how you handled it. Drop a comment. 👇

— Arish Singh
