{"slug": "the-new-code-why-specifications-will-replace-programming", "title": "The New Code: Why Specifications Will Replace Programming", "summary": "A developer built an SDLC harness where AI agents implement features from spec files, but found the bottleneck was underspecified specs. The system treats the spec as source code and generated code as a compiled artifact, with a redesigned pipeline that shares context across tasks to reduce token waste.", "body_md": "The agents were doing exactly what I told them to. That was the problem.\n\nI'd built a pipeline where AI agents could take a spec file, implement a feature, run the tests, review the result, and commit — without me writing a line of code. It mostly worked. Dozens of features shipped. But I kept reviewing the output and feeling like something was off. Not broken. Just subtly wrong in a way that was hard to name.\n\nI spent a while blaming the models. Then the prompts. Then the validation steps. Eventually I had to sit with the obvious: the agents were implementing exactly what I'd written. My specs were underspecified. The bottleneck was always me, at the planning stage.\n\nThere's something that feels right about vibe coding. You're operating at the level of intent — describing what you want and letting the model handle the mechanics. That part is genuinely useful.\n\nBut watch what most people do with the output:\n\n```\nTraditional development:\nSource code  →  Compiler  →  Binary\n(keep the source; regenerate binary anytime)\n\nVibe coding done wrong:\nPrompt  →  LLM  →  Generated code\n(delete the prompt; commit the code)\n```\n\nYou've shredded the source and carefully version-controlled the binary.\n\nThe prompt — your structured description of what you wanted, why, and what \"correct\" meant — is the valuable artifact. The generated code is what compiles from it. 
When you discard the prompt and commit only the output, you've lost the thing that actually mattered.\n\nThe practical consequence shows up six months later: you're staring at code you wrote and spending twenty minutes reverse-engineering your own intent. The spec would have been a thirty-second read.\n\nI built what I call an SDLC (Software Development Lifecycle) harness — a system where instead of writing code directly, you write a spec describing what needs to be built, and AI agents handle the implementation, testing, review, and documentation.\n\nThe spec is the source. The code is what gets compiled from it.\n\nSimple idea. The interesting part is figuring out how to run that pipeline efficiently. I made some expensive mistakes along the way.\n\nMy original design ran every task through the full pipeline — implement, test, review, document, wrap-up — independently, in an isolated environment, in parallel:\n\n```\ntasks.md\n   │\n   ├── Task 1 (isolated)  →  implement → test → review → doc → wrap  [~200k tokens]\n   ├── Task 2 (isolated)  →  implement → test → review → doc → wrap  [~200k tokens]\n   └── Task 3 (isolated)  →  implement → test → review → doc → wrap  [~200k tokens]\n                                                          ────────────────────────────\n                                                          Total: ~200k × N tasks\n```\n\nOn paper: thorough. In practice: around 200,000 tokens per task. A five-task spec burned through a million tokens before I'd seen any integrated output.\n\nThe waste was structural. Setup phases that didn't vary between tasks ran N times. Per-task reviews could only see one task's changes — they missed integration problems anyway. Documentation ran in isolation before anything was assembled.\n\nAnd the per-task review gave me false confidence: it caught issues *within* a task, but couldn't catch whether the integrated result actually worked. 
I was paying for isolation on tasks that were mostly sequential.\n\nThe redesign comes from a simpler question: what actually needs isolation, and what can be shared?\n\nThe implement step benefits from a fresh context per task — no bleed between unrelated changes. Everything else — setup, testing, review, documentation — can run once over the integrated result.\n\nThat produces a clean ladder of tools:\n\n```\n  /patch        trivial fix, no tests needed\n      ↓\n  /sdlc-task    one unit → implement → fast-test → fix → commit\n      ↓\n  /sdlc-run     one whole spec, full lifecycle, in-place\n      ↓\n  /sdlc-flow    one whole spec, worktree isolation, produces a PR\n      ↓\n  /sdlc-block   block-level orchestrator, branch train of PRs\n```\n\nPick the rung that matches the scope. A trivial hotfix doesn't need a review stage. A whole feature spec does. Multiple independent feature blocks running in parallel need the orchestrator.\n\nThe biggest change is in the orchestrator. The original version orchestrated individual tasks. 
The rebuilt version operates at the block level — each block is a complete feature-sized unit of work that runs its own full pipeline in its own branch:\n\n```\n  master-plan.md\n        │\n  sdlc-block orchestrator\n        │\n    Phase 1  ─────────────────────── (parallel)\n    ├── Block A  →  sdlc-flow  →  PR #1\n    └── Block B  →  sdlc-flow  →  PR #2\n        │\n    (after Phase 1 merges)\n        │\n    Phase 2\n    └── Block C  →  sdlc-flow  →  PR #3\n```\n\nAnd inside each `sdlc-flow`, the lean design runs one fresh implement agent per task, then one consolidated back-half over the integrated result:\n\n```\n  shared setup (once)\n        │\n    Task 1  →  fresh implement agent\n    Task 2  →  fresh implement agent\n    Task 3  →  fresh implement agent\n        │\n  test → review → fix → document → wrap-up\n  (once, over the integrated result)\n```\n\nPer-task agent isolation, at roughly the cost of running it once.\n\nFaster and cheaper, yes. Better results, no.\n\nWhat made the results better was fixing the specs.\n\nI'd been optimizing execution while planning was still the bottleneck. The agents were reading underspecified tasks, making reasonable assumptions, and producing technically correct results that missed the actual intent. No amount of pipeline tuning changes that.\n\nYou can't fix a bad spec with a better pipeline.\n\nThe most interesting thing I've done with this pipeline: I used it to redesign itself.\n\nHere's a condensed version of the master plan I wrote before touching a line of code. The goal was to redesign all four SDLC engines — and the spec-driven pipeline was the tool I used to do it:\n\n```\n# SDLC Engines Redesign — Master Plan\n\n## Goal\n\nFour engines ship in the harness, but they no longer earn their place\nas built. The most expensive one burns ~200k tokens per task running a\nfull review+test+document cycle that's mostly redundant when tasks are\nsequential. 
Usage data: almost nobody runs it anymore — the simpler\nsingle-spec runner has become the actual default.\n\nThis redesign gives each engine a distinct scope × ceremony tier,\ngives every run a committed token-accounted state trail, and\nrationalizes the planning commands that feed them.\n\n## Architecture\n\n| Engine       | Scope            | When to use                          |\n|--------------|------------------|--------------------------------------|\n| /patch       | Trivial hotfix   | No tests needed                      |\n| /sdlc-task   | One small unit   | Implement → test → fix → commit      |\n| /sdlc-run    | One whole spec   | Full lifecycle, in-place             |\n| /sdlc-flow   | One whole spec   | Worktree isolation + PR              |\n| /sdlc-block  | A full roadmap   | Block-level orchestration, PRs       |\n\n---\n\n## Phase 0 — Foundation\n\n### Block A — Unified token telemetry\n**What:** Every engine writes a committed state file with a token\nusage block after each phase. Persists what was previously\nrender-only output that vanished when a run ended.\n\n**Why:** Token costs were invisible between runs. You couldn't tell\nwhich stage was expensive without watching the live output scroll by.\n\n**Acceptance criteria:**\n- sdlc-run writes a committed state file with a tokens roll-up\n  after each phase\n- sdlc-flow persists per-task token usage into its committed state\n- No engine references a gitignored breadcrumb file\n- node --check clean on both engines\n\n### Block B — Lean single-unit engine\n**What:** Rewrite the task-level engine into a lean single-unit\nrunner: implement → fast validation → fix loop (≤3 attempts) →\ncommit. 
Delete the heavy stages and coupling flags.\n\n**Why:** Makes small work cheap enough to be worth a dedicated\nengine, and gives the trivial /patch command an intermediate rung\ninstead of jumping straight to the full-spec runner.\n\n**Acceptance criteria:**\n- Engine runs only implement → test → fix → commit\n- Coupling flags are gone (grep-clean)\n- Committed state file with tokens is written\n- node --check clean\n\n---\n\n## Phase 1 — The Headline Change\n\n### Block A — Rewrite the orchestrator as a block-level engine\n**What:** Replace the task-level wave machine with block-level\norchestration. Reads a master-plan file, computes dependency waves\nat block granularity, fans out one sdlc-flow per block in its own\nworktree branch, produces a PR per block by default. Delete the\nlegacy task-level execution engine and its orphaned schema file.\n\n**Why:** The old design's merge-conflict failure mode was structural\n— tasks sharing a worktree conflicting on shared files. Blocks are\nindependent by construction. This eliminates the failure mode\nentirely, and reuses the proven single-spec runner as the inner\nengine rather than duplicating its logic.\n\n**Files:**\n- Modified: sdlc-block.js — full rewrite (keep wave computation,\n  config loader, traced-agent wrapper; add plan-file input,\n  block-level fan-out, branch-train, two-level committed state)\n- Deleted: execution-plan.schema.json — no remaining consumer\n  after the task-level machine is removed\n\n**Interfaces / shared surface:** Reads master-plan-format files\n(the same format /generate-master-plan and /plan produce). Invokes\nsdlc-flow as the inner engine via the workflow() primitive. The\ncommitted block-orchestration-state.json — with child-flow token\nroll-up — is the resume signal and the human review artifact.\n\n**Out of scope:** The /review-PR and /merge-train commands for\nhuman-gated branch-train merging (Block B). The per-block close-out\nquality gate (Block C). 
The harness config schema rewrite for the\nnew block.* keys (Phase 3). This block may read provisional keys\nand leave the schema update for Phase 3.\n\n**Acceptance criteria:**\n- Orchestrator reads a master-plan-format file and fans out\n  sdlc-flow per independent block\n- Default opens a PR per block; --auto-merge merges in\n  dependency order\n- Committed block-orchestration-state.json written with child\n  token roll-up\n- All legacy task-level code is gone (grep-clean for removed\n  symbols: runTaskWorktree, runTaskInPlace, --from test,\n  --verify-depth)\n- node --check clean on the engine file\n```\n\nA few things worth noting about how this spec works.\n\nThe goal section captures *why this is happening* before describing *what to build*. \"Usage data: almost nobody runs it\" is load-bearing context — without that, an agent might preserve backward compatibility with a behavior nobody is actually using.\n\nThe **Files** field is a commitment, not a description. The agent knows exactly which files to touch and which to leave alone. Anything not listed is out of scope by default.\n\nThe **Interfaces / shared surface** field tells the agent what this block exposes to the blocks that come after it. That's the contract other blocks depend on — changing it mid-implementation breaks downstream work.\n\nThe **Out of scope** field does something most specs skip entirely: it names the things you are *not* building. This matters because an agent trying to be helpful will often implement adjacent things that seem obviously related. Explicit out-of-scope entries prevent that drift.\n\nAnd the acceptance criteria are verifiable against the diff. Not \"the engine works better\" — \"these specific symbols are gone, this file is committed, node --check passes.\" These become the exact checklist the review stage runs.\n\nThe practical test: Can you hand this spec to a smart engineer with zero prior context, and have them build the right thing? 
If not, the spec isn't done — and an AI agent will fail the same way.\n\nThe flow that works for me now:\n\n```\n  Feature idea or requirement\n          │\n     /plan or /generate-master-plan\n     (mini-roadmap: what, why, blocks, dependencies)\n          │\n     /generate-tasks\n     (executable spec: tasks + acceptance criteria + decisions)\n          │\n     pick the right engine rung\n          │\n     AI agents implement\n```\n\nThe planning step is where the hard questions happen: what does done actually look like, what are the real constraints, where are the edge cases. These are cheaper to answer before implementation starts than after.\n\nWhen I rush through planning, I pay for it in review cycles. The root cause is almost always an underspecified task — and the fix is always \"write a clearer spec,\" not \"use a better model.\"\n\nWhen I slow down at planning, implementations usually ship clean on the first pass.\n\nThe scarce skill isn't writing code. It isn't prompting. It's writing specifications that fully capture intent — clear enough that a stateless agent can make the right call without asking follow-up questions.\n\nThis is hard. It requires thinking through your assumptions before you start. It requires distinguishing between \"I have a vague sense of what I want\" and \"I can articulate what I want precisely enough for someone else to act on.\"\n\nBut it transfers. The same discipline that makes a good agent spec makes a better architecture decision record, a clearer PR description, a more useful design doc. It makes you a better collaborator regardless of whether the person on the other side is human or AI.\n\nThe tooling is changing fast. The underlying skill isn't.\n\nStart with the spec.\n\n*If this was useful, I write about building production AI and agentic systems at learn-agentic-ai.com — including hands-on learning paths available in both English and Brazilian Portuguese. 
Come build something real.*", "url": "https://wpnews.pro/news/the-new-code-why-specifications-will-replace-programming", "canonical_source": "https://dev.to/bredmond1019/the-new-code-why-specifications-will-replace-programming-58bj", "published_at": "2026-06-25 15:38:28+00:00", "updated_at": "2026-06-25 15:43:34.478000+00:00", "lang": "en", "topics": ["ai-agents", "developer-tools", "large-language-models", "generative-ai"], "entities": ["SDLC harness", "AI agents"], "alternates": {"html": "https://wpnews.pro/news/the-new-code-why-specifications-will-replace-programming", "markdown": "https://wpnews.pro/news/the-new-code-why-specifications-will-replace-programming.md", "text": "https://wpnews.pro/news/the-new-code-why-specifications-will-replace-programming.txt", "jsonld": "https://wpnews.pro/news/the-new-code-why-specifications-will-replace-programming.jsonld"}}