I built a spec-driven workflow for my AI coding agent. Here is what actually mattered.

A solo developer shipped a crypto fintech platform with 13 apps, three databases, and Kubernetes in 70 days using AI agents, but credits success to a spec-driven workflow rather than better models. The developer created an open-source kit called pi-sdd-kit for the Pi coding agent, which enforces a fixed pipeline with human approval gates between phases and uses steering documents and status tokens to prevent agents from acting on unapproved specifications.

I shipped a crypto fintech solo last year: 13 apps, three databases, Kubernetes, real money, in about 70 days, with AI agents doing most of the typing. The thing that made it possible was not a better model. It was refusing to prompt. Here is the problem with prompting a capable agent. It has no memory between sessions. Every conversation starts at zero. Ask it to "add authentication" and it will confidently write 500 lines that compile, pass a lint, and solve a slightly different problem than the one you had, using an architecture you never agreed to. The more capable the model, the further a vague instruction carries it in the wrong direction. So I stopped prompting and started specifying. I packaged the workflow I settled on into an open-source kit for the Pi coding agent https://www.npmjs.com/package/@earendil-works/pi-coding-agent , called pi-sdd-kit https://github.com/felipefontoura/pi-sdd-kit . This post is about the two design decisions that actually earned their keep. The rest is detail. Spec-driven development SDD inverts the usual order. The specification is the primary artifact and the code is a consequence of it, not the other way around. In pi-sdd-kit that means every feature moves through a fixed pipeline, with a human approval gate between each phase: IDEA → PLAN → PRD → SPEC → TASKS → EXEC → REVIEW Each phase is a slash command /skill:sdd-prd , /skill:sdd-spec , and so on . The agent cannot advance to the next phase without your sign-off. That is the whole idea. Everything below is how the sign-off is made real instead of aspirational. Most people put their coding conventions in a CLAUDE.md or AGENTS.md and call it context. That is a start. It is not enough, because conventions tell the agent how to write code and say nothing about what it is building or why the architecture is shaped the way it is. pi-sdd-kit splits context into a layer that lasts. Steering docs live in .ai/steering/ and load every session: .ai/steering/ product.md what it is, who it is for, what it is explicitly not tech-stack.md the stack and, more importantly, the reason for each choice conventions.md patterns: how routes, errors, and auth are structured principles.md the non-negotiables "all money math is integer, never float" The tech-stack.md line that pays for itself is not the dependency list, it is the rationale: "We use PostgreSQL because payment records need ACID guarantees." That one sentence stops the agent from suggesting SQLite three sessions later when you add a service. The steering folder is the reason a solo developer can hold a 13-app system in their head: they do not. The spec holds it. These files change rarely. When they do, it is because you made a deliberate architectural decision, and updating the file is how that decision becomes permanent context. .status is the only gate, and file existence is not approval This is the part I would defend hardest. Each feature spec folder has a one-line .status file: .ai/sdd/specs/001-user-auth/.status contents, in order over the feature's life: requirements:approved design:approved tasks:approved The agent reads that file before it does anything, and the rule is blunt: a design.md sitting on disk is not a green light. A completed tasks.md is not a green light. The only green light is the .status token. The agent prompt is explicit: do not write code before tasks:approved appears. This sounds obvious until you watch an eager agent see a finished-looking spec in the directory and race straight into implementation. A file and an approved file look identical to something scanning the folder. The status token removes that ambiguity completely. Gates that live in your head are culture; a gate that lives in a file the agent must read is a mechanism. Only the mechanism survives a deadline. Functional requirements are written in EARS, the Easy Approach to Requirements Syntax from requirements engineering. It is a handful of sentence templates: WHEN a task is completed, THE SYSTEM SHALL record the timestamp and the user. IF the amount is <= 0, THE SYSTEM SHALL reject with 422 "amount must be positive". WHILE a task is archived, THE SYSTEM SHALL NOT allow edits. WHEN , IF , WHILE , SHALL . It reads like a contract because it is one, and it leaves the agent almost no room to interpret. This is a 30-year-old format used by Airbus and NASA, and it turns out to be exactly what an amnesiac collaborator needs. pi install npm:@felipefontoura/pi-sdd-kit then, in Pi: /reload /skill:sdd-init /skill:sdd-prd write requirements /skill:sdd-spec design, after you approve requirements /skill:sdd-tasks break into 2-4h tasks /skill:sdd-exec implement, only after tasks:approved /skill:sdd-review verify against the spec Repo, with the full command reference and templates: github.com/felipefontoura/pi-sdd-kit https://github.com/felipefontoura/pi-sdd-kit If you have tried spec-driven development and bounced off it, I would genuinely like to hear where the gates felt like overkill. That is the part I am least sure about.