# I built a spec-driven workflow for my AI coding agent. Here is what actually mattered.

> Source: <https://dev.to/felipefontoura/i-built-a-spec-driven-workflow-for-my-ai-coding-agent-here-is-what-actually-mattered-4dkk>
> Published: 2026-07-04 04:47:51+00:00

I shipped a crypto fintech solo last year: 13 apps, three databases, Kubernetes, real money, in about 70 days, with AI agents doing most of the typing. The thing that made it possible was not a better model. It was refusing to prompt.

Here is the problem with prompting a capable agent. It has no memory between sessions. Every conversation starts at zero. Ask it to "add authentication" and it will confidently write 500 lines that compile, pass a lint, and solve a slightly different problem than the one you had, using an architecture you never agreed to. The more capable the model, the further a vague instruction carries it in the wrong direction.

So I stopped prompting and started specifying. I packaged the workflow I settled on into an open-source kit for the [Pi coding agent](https://www.npmjs.com/package/@earendil-works/pi-coding-agent), called [pi-sdd-kit](https://github.com/felipefontoura/pi-sdd-kit). This post is about the two design decisions that actually earned their keep. The rest is detail.

Spec-driven development (SDD) inverts the usual order. The specification is the primary artifact and the code is a consequence of it, not the other way around. In pi-sdd-kit that means every feature moves through a fixed pipeline, with a human approval gate between each phase:

```
IDEA → PLAN → PRD → SPEC → TASKS → EXEC → REVIEW
```

Each phase is a slash command (`/skill:sdd-prd`, `/skill:sdd-spec`, and so on). The agent cannot advance to the next phase without your sign-off. That is the whole idea. Everything below is how the sign-off is made real instead of aspirational.

Most people put their coding conventions in a `CLAUDE.md` (or `AGENTS.md`) and call it context. That is a start. It is not enough, because conventions tell the agent *how* to write code and say nothing about *what* it is building or *why* the architecture is shaped the way it is.

pi-sdd-kit splits context into a layer that lasts. Steering docs live in `.ai/steering/` and load every session:

```
.ai/steering/
  product.md       what it is, who it is for, what it is explicitly not
  tech-stack.md    the stack and, more importantly, the reason for each choice
  conventions.md   patterns: how routes, errors, and auth are structured
  principles.md    the non-negotiables ("all money math is integer, never float")
```

The `tech-stack.md` line that pays for itself is not the dependency list, it is the rationale: "We use PostgreSQL because payment records need ACID guarantees." That one sentence stops the agent from suggesting SQLite three sessions later when you add a service. The steering folder is the reason a solo developer can hold a 13-app system in their head: they do not. The spec holds it.

These files change rarely. When they do, it is because you made a deliberate architectural decision, and updating the file is how that decision becomes permanent context.
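To make "load every session" concrete, here is a minimal sketch of assembling the four steering docs into a single context preamble. It is not how pi-sdd-kit or Pi does it internally; the function name `build_steering_context`, the loading order, and the fail-on-missing policy are all assumptions for illustration.

```python
from pathlib import Path

# File names come from the listing above; the order is an assumption.
STEERING_FILES = ["product.md", "tech-stack.md", "conventions.md", "principles.md"]


def build_steering_context(root: Path) -> str:
    """Concatenate the docs under root/.ai/steering into one preamble string."""
    steering_dir = root / ".ai" / "steering"
    sections = []
    for name in STEERING_FILES:
        path = steering_dir / name
        if not path.is_file():
            # Hypothetical policy: a missing doc is an error, not a silent skip.
            raise FileNotFoundError(f"missing steering doc: {path}")
        body = path.read_text(encoding="utf-8").strip()
        sections.append(f"## {name}\n\n{body}")
    return "\n\n".join(sections)
```

Failing loudly on a missing file is a deliberate choice in this sketch: a session that silently starts without `principles.md` is exactly the kind of context loss the steering layer exists to prevent.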

## `.status` is the only gate, and file existence is not approval

This is the part I would defend hardest. Each feature spec folder has a one-line `.status` file:

```
.ai/sdd/specs/001-user-auth/.status
# contents, in order over the feature's life:
#   requirements:approved
#   design:approved
#   tasks:approved
```

The agent reads that file before it does anything, and the rule is blunt: a `design.md` sitting on disk is **not** a green light. A completed `tasks.md` is **not** a green light. The only green light is the `.status` token. The agent prompt is explicit: do not write code before `tasks:approved` appears.

This sounds obvious until you watch an eager agent see a finished-looking spec in the directory and race straight into implementation. A file and an approved file look identical to something scanning the folder. The status token removes that ambiguity completely. Gates that live in your head are culture; a gate that lives in a file the agent must read is a mechanism. Only the mechanism survives a deadline.
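The rule is simple enough to express as a check. In the kit it is enforced through the agent prompt, so treat the following as a sketch of the logic rather than the kit's implementation; `approved_phase` and `may_write_code` are hypothetical names, and the token format is taken from the `.status` listing above.

```python
from pathlib import Path
from typing import Optional

# Phase names as they appear in the .status tokens shown above.
PHASES = ("requirements", "design", "tasks")


def approved_phase(spec_dir: Path) -> Optional[str]:
    """Return the phase named by a '<phase>:approved' token, else None."""
    status_file = spec_dir / ".status"
    if not status_file.is_file():
        return None
    token = status_file.read_text(encoding="utf-8").strip()
    phase, _, state = token.partition(":")
    if state != "approved" or phase not in PHASES:
        return None
    return phase


def may_write_code(spec_dir: Path) -> bool:
    """Only 'tasks:approved' unlocks implementation.

    Note what is absent: no check for design.md or tasks.md on disk.
    """
    return approved_phase(spec_dir) == "tasks"
```

The check never looks at whether `design.md` or `tasks.md` exists. A folder full of finished-looking documents with `design:approved` in `.status` still returns `False`.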

Functional requirements are written in EARS, the Easy Approach to Requirements Syntax from requirements engineering. It is a handful of sentence templates:

```
WHEN a task is completed, THE SYSTEM SHALL record the timestamp and the user.
IF the amount is <= 0, THE SYSTEM SHALL reject with 422 "amount must be positive".
WHILE a task is archived, THE SYSTEM SHALL NOT allow edits.
```

`WHEN`, `IF`, `WHILE`, `SHALL`. It reads like a contract because it is one, and it leaves the agent almost no room to interpret. The format was introduced at Rolls-Royce in 2009 and is used by Airbus and NASA, and it turns out to be exactly what an amnesiac collaborator needs.
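Because the templates are that rigid, a requirement line can be checked mechanically. Below is a rough sketch of a classifier for the three shapes shown above. The regexes are assumptions derived only from those three examples (full EARS has further templates, and stricter forms write `IF ..., THEN ...`), and `classify_requirement` is a hypothetical helper, not part of pi-sdd-kit.

```python
import re
from typing import Optional

# One pattern per example shape; labels follow the usual EARS pattern names.
EARS_PATTERNS = {
    "event-driven": re.compile(r"^WHEN .+, THE SYSTEM SHALL .+\.$"),
    "unwanted-behavior": re.compile(r"^IF .+, THE SYSTEM SHALL .+\.$"),
    "state-driven": re.compile(r"^WHILE .+, THE SYSTEM SHALL .+\.$"),
}


def classify_requirement(line: str) -> Optional[str]:
    """Return the EARS pattern a line follows, or None if it fits none."""
    text = line.strip()
    for kind, pattern in EARS_PATTERNS.items():
        if pattern.match(text):
            return kind
    return None


if __name__ == "__main__":
    samples = [
        "WHEN a task is completed, THE SYSTEM SHALL record the timestamp and the user.",
        "WHILE a task is archived, THE SYSTEM SHALL NOT allow edits.",
        "Add authentication and make it secure.",
    ]
    for sample in samples:
        print(classify_requirement(sample), "<-", sample)
```

A line that comes back as `None` is the useful signal: it is usually a vague instruction that slipped into the requirements file dressed up as a requirement.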

```
pi install npm:@felipefontoura/pi-sdd-kit
# then, in Pi:
/reload
/skill:sdd-init
/skill:sdd-prd      # write requirements
/skill:sdd-spec     # design, after you approve requirements
/skill:sdd-tasks    # break into 2-4h tasks
/skill:sdd-exec     # implement, only after tasks:approved
/skill:sdd-review   # verify against the spec
```

Repo, with the full command reference and templates: [github.com/felipefontoura/pi-sdd-kit](https://github.com/felipefontoura/pi-sdd-kit)

If you have tried spec-driven development and bounced off it, I would genuinely like to hear where the gates felt like overkill. That is the part I am least sure about.