{"slug": "ludwig-spec-driven-development-mcp", "title": "Ludwig Spec Driven Development MCP", "summary": "Ludwig Spec Driven Development MCP is a specification-driven framework that uses prose-first markdown specs to drive LLM code generation and verification for Rust projects. The framework ensures spec and implementation stay coupled by using the same document for both generation and verification, with deterministic checks via cargo test. Named after Ludwig Wittgenstein, it targets Rust exclusively and supports two workflows: decomposing descriptions into specs and generating code from active specs.", "body_md": "A specification-driven development framework whose specs are written as close\nto natural language as possible. The same prose-first markdown spec drives\nLLM code generation **and** verifies the resulting code. Named after\nLudwig Wittgenstein.\n\nScope:Ludwig targetsRust. Deterministic checks shell out to`cargo test`\n\n, so verifying a project requires a Cargo workspace. The spec format itself is plain markdown, but verification is Rust-specific by design — there is no plan to add adapters for other languages.\n\nToday, when a human asks an LLM to write code, the request is a chat message\nand the contract is whatever the model inferred. The result drifts the moment\neither side is edited. Ludwig replaces the chat message with a\n**specification** — a markdown document with a small fixed shape — that is the\ndurable, versioned source of intent. The same document drives generation and\nverifies the resulting code, so spec and implementation stay coupled.\n\n```\n---\nid: token-bucket-rate-limiter\ntitle: Token-bucket rate limiter\nstatus: active\nimplements:\n  - src/rate_limit/token_bucket.rs\nversion: 4\n---\n\n## Intent\nProtect downstream services from bursty clients by allowing short bursts up\nto a configured capacity while enforcing a steady average rate. The limiter\nis a building block for per-tenant API quotas; it is not, by itself, a\nfairness mechanism between tenants.\n\n## Behavior\n- {#b1} A limiter is created with `capacity` and `refill_rate` (tokens/sec).\n- {#b2} `try_acquire(n)` returns true and consumes n tokens iff at least n are available.\n- {#b3} Tokens refill continuously based on wall-clock time, capped at capacity.\n\n## Examples\n​``` example name=\"burst then throttle\"\nGiven a limiter with capacity 5 and refill_rate 1/sec\nWhen try_acquire(1) is called 5 times in immediate succession\nThen all 5 calls return true\nAnd the 6th call in the same instant returns false\n​```\n\n## Invariants\n- {deterministic} tokens_consumed <= capacity + refill_rate * elapsed_seconds.\n- {property} after waiting C / refill_rate seconds the limiter is full.\n- {judgment} Errors surfaced to callers name the limiter and required wait in plain English.\n```\n\nSections are fixed and ordered: **Intent → Behavior → Examples → Invariants**\n(plus optional **Non-goals**, **Open questions**, **Implementation notes**).\nThe only embedded DSL is `Given/When/Then`\n\ninside Examples — Gherkin-shaped so\nLLMs already know it.\n\nLudwig supports two complementary directions:\n\n**1. Description → specs.** Start from a project or feature description; the\nhost agent decomposes it into specs and drafts each one; Ludwig validates and\npersists; the human reviews before any implementation.\n\n```\necho \"A URL shortener with per-tenant analytics\" | ludwig decompose\n# inside Claude Code:\n/project-decompose A URL shortener with per-tenant analytics\n# agent emits JSON, then for each game/spec calls game.create + spec.propose + spec.write\nludwig catalog && cat specs/_index.md   # review, then move drafts to status: active\n```\n\n**2. Spec → code.** Once a spec is active, generate the implementation, then\nverify.\n\n```\nludwig init                              # one-time scaffolding\nludwig new auth/login --game auth        # OR write a spec by hand\n/spec-generate login                     # inside Claude Code: LLM writes src/ + tests/ludwig_login.rs\nludwig verify login                      # structural + deterministic + judgment-pending\n/spec-verify login                       # inside Claude Code: also evaluates {judgment} invariants\ngit add specs/ src/ tests/ .ludwig/state.json && git commit\n```\n\nWhen you change the spec, `ludwig diff`\n\nflags the implementing files. When you\nchange the code directly, the same diff flags drift on the other side. The\ntrailing `ludwig-spec: <id>@<version> hash=<sha>`\n\ncomment in each implementing\nfile is load-bearing — don't hand-edit it.\n\n**Structural**— frontmatter is well-typed, sections are in order, behavior tags are unique, every file in`implements:`\n\nexists and carries a matching`ludwig-spec:`\n\nstamp. In-process Rust. Sub-second.**Deterministic**— Ludwig scaffolds`tests/ludwig_<slug>.rs`\n\nwith one`#[test]`\n\nper Example and per`{deterministic}`\n\ninvariant, each containing a`todo!()`\n\nbody and a doc-comment with the Gherkin steps. You replace the bodies.`ludwig verify`\n\nshells out to`cargo test --test ludwig_<slug>`\n\nand parses the results.**Judgment**— each`{judgment}`\n\ninvariant is packaged as a prompt and emitted as JSON. Ludwig itself holds no API keys; the host agent (Claude Code) evaluates each prompt and writes verdicts back via`ludwig verify --ingest-judgments <file>`\n\n. Verdicts are keyed by the spec hash, so changing the spec invalidates old verdicts automatically.\n\n`{property}`\n\ninvariants are parsed but not yet machine-verified — no generator\nruns. Rather than silently pass, Ludwig reacts to the spec's status: on an\n`active`\n\nspec each property invariant reports `fail`\n\n(you can't rely on an\ninvariant nothing checked), and on a draft/deprecated spec it reports `skip`\n\n.\nSee `docs/specs/property-invariants-deferred.spec.md`\n\n.\n\nThe `canonical:`\n\nsetting in `ludwig.yml`\n\ndecides which side is the source of\ntruth when a spec and its code diverge:\n\n`spec`\n\n(default) — the spec leads. On drift, the code is stale;`ludwig diff`\n\ntells you to regenerate or bump the spec version.`code`\n\n— the code leads (spec-from-code). On drift, the spec is the stale side;`ludwig diff`\n\ntells you to update the spec to match the code, then re-verify. The skill guidance flips accordingly.\n\nAn unknown value is rejected at load time. (Ludwig does not yet *derive* a spec\nfrom code — see Deferred.)\n\n```\ngit clone <this repo> && cd ludwig\ncargo install --path .\n```\n\nOr, from a release build, symlink the binary onto your PATH:\n\n```\ncargo build --release\nsudo ln -s \"$PWD/target/release/ludwig\" /usr/local/bin/ludwig\n```\n\nRequires a Rust toolchain (the Ludwig binary), and — in any project Ludwig\nverifies — whatever runs `cargo test`\n\n.\n\n| Command | Purpose |\n|---|---|\n`ludwig init` |\nScaffold `ludwig.yml` , `specs/` , `.ludwig/` , register Claude Code skill |\n`ludwig new SLUG [--game G]` |\nScaffold a new spec from a blank template |\n`ludwig move SLUG [--to-game G] [--force]` |\nRelocate an existing spec to a different game |\n`ludwig decompose` |\nEmit a prompt to break a project description (stdin or `-d` ) into specs |\n`ludwig propose SLUG -d DESC [-g GAME]` |\nEmit a prompt for drafting a single spec |\n`ludwig write-spec SLUG [-g GAME]` |\nValidate spec markdown on stdin and persist it |\n`ludwig game-new NAME [-i INTENT] [-x Term:Defn ...]` |\nCreate a language-game manifest |\n`ludwig parse [PATH] [--quiet]` |\nParse one or all specs; report structural errors |\n`ludwig plan ID` |\nEmit JSON generation brief for the host agent |\n`ludwig verify [ID] [--all] [--json]` |\nRun the full pipeline; write report under `.ludwig/reports/` |\n`ludwig verify [ID] --emit-judgment-prompts` |\nPrint judgment prompts as JSON |\n`ludwig verify --ingest-judgments FILE` |\nIngest judgment verdicts back into state.json |\n`ludwig diff [ID] [--all] [--json]` |\nSurface drift between specs and code |\n`ludwig catalog` |\nRegenerate `specs/_index.md` |\n`ludwig mcp [--root PATH]` |\nStart the MCP server over stdio |\n\nLudwig speaks MCP (Model Context Protocol) over stdio. Register it with Claude Code:\n\n```\nclaude mcp add ludwig -- ludwig mcp\n# or, scoped to one project:\nclaude mcp add --scope project ludwig -- ludwig mcp\n```\n\nThe server discovers the project at each tool call (`$PWD`\n\n, then\n`$LUDWIG_PROJECT`\n\n, then `--root`\n\n).\n\n| Tool | Purpose |\n|---|---|\n`spec.list` |\nList all specs in the project |\n`spec.read` |\nReturn the parsed structure of a spec |\n`spec.plan` |\nGeneration brief (drives code generation) |\n`spec.verify` |\nRun the verification pipeline |\n`spec.diff` |\nDrift report between a spec and its implementing files |\n`spec.propose` |\nEmit a prompt for drafting a spec from a description |\n`spec.write` |\nValidate agent-drafted markdown and persist it |\n`spec.move` |\nRelocate a spec into a different game |\n`spec.ingest_judgments` |\nPersist judgment verdicts inline (no file path needed) |\n`project.decompose` |\nEmit a prompt to break a project into specs + games |\n`game.create` |\nCreate a language-game (`_game.md` ) |\n\nResources:\n\n`ludwig://spec/<id>`\n\n— raw spec markdown`ludwig://report/latest`\n\n— most recent verification report\n\nOnce `ludwig mcp`\n\nis registered, you can drive an entire project from a single\nchat session. The agent calls Ludwig's MCP tools instead of you running the\nCLI by hand; you stay in the loop for review.\n\n-\n**Scaffold the project.** In an empty directory, run`ludwig init`\n\nonce. This creates`ludwig.yml`\n\n,`specs/`\n\n,`.ludwig/`\n\n, and the Claude Code skill manifest. The MCP server discovers this root on every tool call. -\n**Describe the app.** Tell the agent what you want to build, e.g.*\"Build a URL shortener with per-tenant analytics.\"*The agent calls`project.decompose`\n\nto get a prompt, then proposes a set of games (`auth`\n\n,`shorten`\n\n,`analytics`\n\n, …) and a draft spec per behavior. -\n**Persist the drafts.** For each game the agent calls`game.create`\n\n; for each spec it calls`spec.propose`\n\nto get the drafting prompt, then`spec.write`\n\nto validate and persist the markdown. Anything malformed is rejected with a structural error — the agent retries until the spec parses. -\n**Review and activate.** You read`specs/_index.md`\n\n(regenerated by`ludwig catalog`\n\n), resolve any`Open questions`\n\n, and flip drafts to`status: active`\n\n. Active is the gate: nothing generates code until you approve the intent. -\n**Generate code.** The agent calls`spec.plan`\n\nper active spec to get the generation brief (resolved glossary, dependencies, behavior tags, examples) and writes the implementation and test bodies. Each generated source file ends with the`ludwig-spec: <id>@<version> hash=…`\n\nstamp. -\n**Verify.** The agent calls`spec.verify`\n\n, which runs structural and deterministic layers and emits judgment prompts. The agent evaluates each judgment prompt itself, then calls`spec.verify`\n\nagain with the verdicts ingested. The report lands under`.ludwig/reports/`\n\nand is also reachable as`ludwig://report/latest`\n\n. -\n**Iterate.** When you edit a spec, the agent re-runs`spec.plan`\n\nand regenerates the affected files; when you edit code, drift detection flags the seam. The spec is the durable artifact — chat turns are not.\n\nA minimal driver prompt: *\"Use the Ludwig MCP tools. Decompose this\ndescription into specs, write them, wait for me to activate, then generate\nand verify.\"* The agent does the rest.\n\n```\n<project root>/\n  ludwig.yml                       # project config\n  specs/\n    _index.md                      # generated: spec catalog\n    auth/\n      _game.md                     # language-game manifest\n      login.spec.md\n  src/\n    auth/login.rs                  # ends with `// ludwig-spec: login@1 hash=…`\n  tests/\n    ludwig_login.rs                # scaffolded once, then user-owned\n  .ludwig/\n    state.json                     # spec hashes, file fingerprints, judgment verdicts\n    reports/                       # verification reports (JSON + latest.md)\n  .claude/\n    skills/\n      ludwig.yaml                  # registered by `ludwig init`\n```\n\n`<slug>.spec.md`\n\nmatches the frontmatter `id`\n\n. IDs are globally unique within\na project.\n\nv0.1 ships:\n\n- Strict, hand-rolled markdown spec parser\n- Project scaffolding, catalog, language-game inheritance\n- Generation brief (\n`ludwig plan`\n\n) with transitively-resolved`depends_on`\n\n- Rust adapter: scaffolds\n`#[test]`\n\nstubs from Examples + invariants, runs via`cargo test`\n\n- Structural + deterministic verification, judgment-prompt round-trip\n- Three-way drift detection (\n`ludwig diff`\n\n) — stale stamp, body changed, missing, unstamped - Claude Code skill manifest + JSON-RPC 2.0 MCP server over stdio\n- 60-test integration suite\n\nDeferred:\n\n- Property-based\n**generation**—`{property}`\n\ninvariants are parsed and their verification policy is defined and tested (active →`fail`\n\n, non-active →`skip`\n\n; see \"Verification, in three layers\"), but no generator produces or runs property tests yet. **spec-from-code generation**—`canonical: code`\n\nmode flips drift semantics so the spec is the stale side (see \"Canonical direction\"), but Ludwig does not yet derive or auto-update a spec from existing code; you update the spec yourself, then re-verify.\n\n```\ncargo build --offline    # the cached crate set is sufficient\ncargo test --offline\ncargo run --offline -- version\n```\n\nMIT.", "url": "https://wpnews.pro/news/ludwig-spec-driven-development-mcp", "canonical_source": "https://github.com/samdvr/ludwig", "published_at": "2026-06-26 04:40:50+00:00", "updated_at": "2026-06-26 05:05:01.195867+00:00", "lang": "en", "topics": ["developer-tools", "large-language-models", "generative-ai", "ai-agents"], "entities": ["Ludwig", "Rust", "Claude Code", "Ludwig Wittgenstein"], "alternates": {"html": "https://wpnews.pro/news/ludwig-spec-driven-development-mcp", "markdown": "https://wpnews.pro/news/ludwig-spec-driven-development-mcp.md", "text": "https://wpnews.pro/news/ludwig-spec-driven-development-mcp.txt", "jsonld": "https://wpnews.pro/news/ludwig-spec-driven-development-mcp.jsonld"}}