Most AI coding workflows start the same way - you open the agent, describe what you want in a sentence or two, and watch it write code. It feels fast. Then the diff comes back and it built the wrong thing, or the right thing the wrong way, and you spend the next hour correcting it through follow-up prompts. The agent was never confused about how to write the code. It was confused about what you actually wanted.
spec-driven development is a direct response to that failure mode. Instead of prompting an agent straight to code, you first produce a spec - a written description of the requirements, the design, and the tasks - and you correct that spec until it is right. Only then does the spec drive implementation. The decision about what to build happens once, explicitly, on paper, before any code exists.
The term has become popular through tools like AWS Kiro, which put "specs" directly in the IDE, and GitHub's spec-kit, which brings the same idea to Copilot, Claude Code, and other agents. But the idea is older than any of these tools - writing down what a system should do before building it is just engineering, and the AI tooling is a new wrapper around an old discipline.
The shared shape across the tools is a three-part artifact, usually written as plain markdown:
Each part feeds the next. Requirements constrain the design, the design decomposes into tasks, and the tasks are what the agent implements. The human reviews and corrects at each step, so by the time code generation starts, the agent is working from a document you have already agreed with rather than guessing at thousands of unstated details from a one-line prompt.
For most of software's history, the expensive, slow part was writing the code. Specs felt like overhead because the typing was the bottleneck and you could just refactor your way out of a misunderstanding. That economics has changed. When an agent can produce a feature's worth of code in minutes, writing the code is no longer the constraint. The bottleneck moved to deciding precisely what to build. A spec is how you make that decision once, in a form both you and the agent can read. Vague intent forces the model to fill in gaps, and it fills them with plausible guesses that you only discover are wrong after the code is written. A spec moves the guessing forward, into a cheap document, where a wrong assumption costs a sentence to fix instead of a sprint to unwind.
The point of a spec is not documentation. It is to surface the disagreement between what you meant and what the agent understood while that disagreement is still one paragraph instead of two thousand lines.
Reviewing a large diff is reverse-engineering - you read the code and try to reconstruct what the author intended, then judge whether that intent was correct. A spec inverts the order. You agree on the intent first, in language, then the implementation is checked against an intent you already approved. Reading "validate email format and reject duplicates" and deciding it is right takes seconds. Reading the endpoint, the validation, and the database query to infer the same thing takes much longer and you can still miss the gap.
The most expensive mistake in AI-assisted work is building the wrong thing well. A spec is the cheapest place to catch it. If the requirements say the feature should do X and you wanted Y, you find out before a single task runs, not after the agent has produced a polished, tested, completely misaimed implementation.
Once a design is decomposed into discrete tasks with clear boundaries, you can fan multiple agents out across them without them colliding or duplicating work. The spec is the shared contract that keeps independent work coherent. Without it, parallel agents each invent their own interpretation of the same vague goal and you spend the saved time reconciling them.
spec-driven development is not free and it is not always worth it. The discipline carries real costs, and being fair about them is the only way to use it well.
It is overkill for small work. A one-line fix, a copy change, a rename - writing a requirements-design-tasks document for these is slower than just doing them, and pretending otherwise is how a good practice turns into busywork. The size of the spec should match the size and risk of the change, and for trivial work that means no spec at all.
Specs go stale. A spec is only the source of truth if it is maintained. The moment the code drifts from the document and nobody updates the document, the spec becomes a confident, out-of-date lie - worse than no spec, because people trust it. This is the failure that sank earlier model-driven approaches, and it has not gone away.
And agents do not always obey. A larger context window and a detailed spec do not guarantee the model follows every line. Specs reduce ambiguity, they do not eliminate the need to review the output. Some practitioners also find verbose markdown specs tedious to review in their own right, which is a real tradeoff rather than a solved problem. Treat the spec as a tool for making intent explicit, not as a contract the model is forced to honor.