EDD Closes the Loop — But Only Half of It


A recent piece by Andrea Laforgia on Expectation-Driven Development (EDD) made the rounds, and it deserves serious attention. The core argument is compelling: AI agents produce code faster than humans can meaningfully review it, so we need a structured protocol for specifying intent before implementation and demanding evidence of fulfillment afterward. The human developer transitions from author to editor — from writing code to evaluating it.

That framing is right. And the EDD workflow — write expectations in plain text, let the agent implement, ask the agent to prove it, challenge the evidence, iterate — is a real improvement over the current default, which is roughly "trust and hope the CI is green."

But EDD solves a specific problem: the gap between human intention and AI implementation. It does not solve the problem that comes next.

To make this concrete, picture a developer asking an agent to fix how discount codes are applied at checkout. The expectation is precise: discounts apply to the pre-tax subtotal, tax is calculated after, an empty cart returns zero rather than an error. The agent implements it, runs the test suite, and produces evidence — three scenarios with real numbers, matching exactly what was specified. The developer reviews the evidence adversarially, pushes back once on a stacked-discount edge case, gets a revised version, and is satisfied. This is EDD working exactly as intended.
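The three expectations in this example are concrete enough to be written as executable checks. The following is a minimal sketch, not the implementation from the scenario: the function name `checkout_total`, the percentage-based discount, and the 20% tax rate are all assumptions made for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def checkout_total(item_prices, discount_percent=Decimal("0"), tax_rate=Decimal("0.20")):
    """Hypothetical checkout calculation used to illustrate the expectations.

    - The discount applies to the pre-tax subtotal.
    - Tax is calculated after the discount.
    - An empty cart returns zero rather than raising an error.
    """
    if not item_prices:
        return Decimal("0.00")
    subtotal = sum(item_prices, Decimal("0"))
    discounted = subtotal * (Decimal("1") - discount_percent / Decimal("100"))
    total = discounted * (Decimal("1") + tax_rate)
    return total.quantize(CENT, rounding=ROUND_HALF_UP)


# Expectation 1: the discount comes off the pre-tax subtotal, tax comes after.
# 100.00 with 10% off is 90.00; 20% tax (assumed rate) on 90.00 gives 108.00.
assert checkout_total([Decimal("60.00"), Decimal("40.00")], Decimal("10")) == Decimal("108.00")

# Expectation 2: without a discount, tax applies to the full subtotal.
assert checkout_total([Decimal("100.00")]) == Decimal("120.00")

# Expectation 3: an empty cart yields zero instead of an error.
assert checkout_total([]) == Decimal("0.00")
```

Note what these checks cover and what they do not: every assertion is about discounts, tax, and the empty cart. None of them says anything about who else calls the code being changed.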

EDD ends when the developer is satisfied with the evidence. The expectation has been met. The code works. The diff is ready.

What happens after that?

In most teams, the answer is: it gets merged. Maybe a colleague glances at the diff. Maybe not. The CI is green, the expectations were verified (at least in the agent's own estimation), and the code lands in the main branch.

In our example, the discount fix touched a shared PricingEngine interface, the same one the inventory team's reservation logic depends on. Nobody chose to ignore that. It simply wasn't part of the expectation. The expectation was about discounts and tax, not about who else reads from that interface. Three weeks later, a reservation bug surfaces that takes two days to trace back to this merge. This is precisely where a different problem begins: the distance between finished code and trusted repository.

EDD is, at its core, an awareness tool. It makes the developer better informed about whether the code fulfills its stated intent. But awareness tools have a structural limitation: their effectiveness depends entirely on whether someone acts on what they now know. A very thorough EDD process can still produce a merge that silently violates an architectural boundary — not because the code is wrong, but because the expectations never captured the right constraints.

Nobody wrote an expectation that said: "This change must not modify a shared interface that three other teams depend on without their knowledge." That kind of constraint does not live in the feature spec. It lives in the structure of the codebase.

Think of the difference as a receptionist versus a turnstile. A receptionist notices if someone heading into a restricted area looks like they don't belong, and says something. Whether that visitor stops depends entirely on whether they care to listen. A turnstile does not notice anything — it simply does not open without the right badge. EDD, however thorough, is a receptionist. It can flag, advise, and warn. It cannot stop a determined merge.

Specification problems ask: Does the code do what I intended? EDD addresses this. It forces developers to articulate intent before implementation and to demand evidence that the intent was fulfilled.

Coordination problems ask: Does this change affect something that belongs to someone else? No amount of expectation-writing resolves this, because the affected parties are not in the room. The constraint is not derivable from the feature spec alone. It requires knowledge of the codebase's actual coupling structure — which files have historically changed together, which teams own which components, where the real boundaries are.

EDD is designed for the first problem. It is not designed for the second. Applying it to the second produces a feeling of rigor without the substance.

The good news is that coordination problems leave traces.

When two components are genuinely coupled — when changing one reliably requires changing the other — that pattern shows up in the commit history. Files that have been modified together repeatedly across time exhibit change coupling: a data signal derived not from someone's opinion about the architecture, but from the actual history of how the codebase evolved.

A seismograph does not predict an earthquake by reasoning about plate tectonics from first principles. It records vibration, and the pattern of past vibration tells you something real about where the next one is likely to originate. Change coupling works the same way: it does not need to understand why PricingEngine and the inventory reservation logic are related. It only needs to notice that, across forty prior commits, they have moved together twenty-three times. That is enough to raise a flag worth taking seriously.
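To show how little machinery this signal needs, here is a rough sketch of a co-change count. It assumes the commit history has already been reduced to one list of file paths per commit (for instance by parsing the output of `git log --name-only`). The file paths are hypothetical, and reading "forty prior commits" as forty commits that touched the pricing file is an interpretation, not something the example spells out.

```python
from collections import Counter
from itertools import combinations


def co_change_counts(commits):
    """Count, for every pair of files, how many commits touched both.

    `commits` is an iterable of file-path collections, one per commit.
    Returns (commits per file, commits per sorted file pair).
    """
    touched = Counter()
    together = Counter()
    for files in commits:
        unique = sorted(set(files))
        touched.update(unique)
        together.update(combinations(unique, 2))
    return touched, together


def coupling_ratio(touched, together, source, other):
    """Share of the commits touching `source` that also touched `other`."""
    pair = tuple(sorted((source, other)))
    if touched[source] == 0:
        return 0.0
    return together[pair] / touched[source]


# Synthetic history mirroring the numbers in the text: 40 commits touch the
# pricing file, and 23 of them also touch the reservation file.
PRICING = "pricing/PricingEngine.java"        # hypothetical path
RESERVATION = "inventory/Reservation.java"    # hypothetical path
history = [[PRICING, RESERVATION]] * 23 + [[PRICING]] * 17

touched, together = co_change_counts(history)
ratio = coupling_ratio(touched, together, PRICING, RESERVATION)
print(f"{ratio:.3f}")  # 0.575
```

The ratio is directional on purpose: it answers "when the pricing file changes, how often does the reservation file change with it?", which is the question a pre-merge check would ask. Real tooling would likely also filter out very large commits and weight recent history more heavily; this sketch does neither.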

That distinction — between an advisory generated from a prompt and a trigger generated from data — is the difference between awareness and governance. One is an opinion about what might matter. The other is evidence about what has mattered.

The full workflow for AI-assisted development looks like a loop with two distinct halves:

First half (EDD): Specify intent → agent implements → agent proves → human challenges → iterate to convergence. This closes the gap between what the developer wanted and what the agent produced.

Second half (Change Coupling + Governance): Before merge, check whether the change crosses an ownership boundary that the repository's history suggests is real. If it does, trigger a coordination step — not as a suggestion, but as a requirement.
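The second half can be sketched as a small pre-merge check. This is an illustration under stated assumptions rather than a description of any particular tool: the `merge_gate` function, the `owners` and `coupling` mappings, the team names, and the 0.5 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    allowed: bool
    required_teams: frozenset


def merge_gate(changed_files, author_team, owners, coupling,
               approvals=frozenset(), threshold=0.5):
    """Block a merge until every affected team has signed off.

    owners:   file path -> owning team (unlisted files count as the author's)
    coupling: (changed file, coupled file) -> historical coupling ratio
    The data structures and the 0.5 threshold are illustrative assumptions.
    """
    affected = set()
    for path in changed_files:
        # Direct ownership: the change touches another team's file.
        affected.add(owners.get(path, author_team))
        # Historical coupling: the changed file usually moves together
        # with a file that another team owns.
        for (source, other), ratio in coupling.items():
            if source == path and ratio >= threshold:
                affected.add(owners.get(other, author_team))
    affected.discard(author_team)
    missing = frozenset(affected - set(approvals))
    return GateResult(allowed=not missing, required_teams=missing)


owners = {
    "pricing/PricingEngine.java": "checkout",     # hypothetical ownership map
    "inventory/Reservation.java": "inventory",
}
coupling = {("pricing/PricingEngine.java", "inventory/Reservation.java"): 0.575}

blocked = merge_gate(["pricing/PricingEngine.java"], "checkout", owners, coupling)
print(blocked.allowed, sorted(blocked.required_teams))  # False ['inventory']

approved = merge_gate(["pricing/PricingEngine.java"], "checkout", owners, coupling,
                      approvals={"inventory"})
print(approved.allowed)  # True
```

The design choice that matters is the return value. The function does not emit a warning that can be scrolled past; it returns `allowed=False` until the named teams have approved, which is what separates a requirement from a suggestion.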

Neither half replaces the other. EDD without governance produces well-specified code that still merges silently across team boundaries. Governance without EDD produces gates that catch coordination problems but do nothing about specification problems. Together, they address the full distance from intent to repository.

Laforgia is honest about one of EDD's core weaknesses: the fox-guarding-the-henhouse problem. The same AI that wrote the code produces the evidence that the code works.

But the governance half of the loop has its own version of this problem. If the gate is based on static rules — interface changes always require review — then developers learn to route around it. The rules are too blunt. The gate becomes theater.

The alternative is a gate grounded in something the codebase itself produced: coupling patterns derived from the actual commit history, mapped to the actual ownership structure. When the repository's own history says that changes in this area have not historically been local decisions, that is not an opinion. That is data.

The last mile is not a place for better prompts. It is a place for better data — and for making sure that data has teeth.

Originally published on calyntro.com. Calyntro surfaces change coupling patterns from Git history and maps them to team ownership, turning the repository's own history into a governance signal.

source & further reading

dev.to: original article