Toyota cut the cost of the car by fixing the process, not the part. Agentic code needs the same.
TL;DR. Everyone has rediscovered that an agent is a model in a loop, but a loop that stops when the model decides it is finished is not a control loop, it is an unguided projectile. Toyota closed exactly this kind of loop seventy years ago with jidoka, stop the line the instant a defect appears, and kaizen, turn every defect into a permanent change to the process. Their agentic equivalents are Continuous Enforcement, which makes an agent's recurring mistake non-committable rather than discouraged in a prompt, and Continuous Verification, which makes "done" something a kernel computes against a specification rather than something a model declares. Together they are the comparator and the stop-cord the loop discourse leaves unspecified: loose enough to let the agent roam, tight enough to converge. The same idea explains why a small model in a tight loop can beat a frontier model asked once, why two individually sound rules can combine into a harmful one, and why an agent that cannot feel the cost of your time will spend it freely.
The loop is having a moment, and for once the excitement points at something real. Anthropic ships the agentic loop as a documented primitive in its Agent SDK: receive a prompt, call a tool, feed the result back, repeat. It writes, more ambitiously, about recursive self-improvement, the loop that builds the next loop. And across every feed someone is explaining, with the air of a revelation, that an agent is "just a model in a loop". They are right that the loop is the unit, and right that it is the thing that separates a chatbot from an engineer.
They are also describing the oldest idea in control engineering, which means the interesting question is not whether you have a loop but what closes it. Feeding a tool result back and conditioning the next action on it is already feedback, and a capable model deciding what that result means is already a kind of comparator. The careful articulations go further: the Agent SDK lets you cap turns and spend, and exposes hooks that fire before a tool runs and when the agent stops, which are precisely the places a real check belongs. None of that is the strawman. The strawman is what the loop does by default, when you specify none of it, and the default is stark in the SDK's own words: the loop ends "when Claude produces a response with no tool calls". The stopping condition, the comparator that rules the work done, is the actuator's own say-so. A loop that halts when the thing doing the work declares victory has no comparator at all. The two things that rush in to fill that gap are an agent grading its own homework in prose and a prompt stuffed with "always remember to" guardrails. Both feel like control. Neither is.
Manufacturing settled this argument seventy years ago, across a great many ruined batches. The discipline that came out the other side has a name, the Toyota Production System, and the two ideas I want to borrow from it are jidoka [autonomation: building in the ability to stop the instant a defect appears] and kaizen [continuous improvement: every defect becomes a permanent change to the process, not a patch to the part]. Software already imported both, more thoroughly than the loop discourse admits. The problem is that it tuned them for a slow, careful, human author, and the agent is neither slow nor careful nor, in any sense that matters, an author with a stake in the outcome.
So I want to give two names to the parts of the loop that the agent breaks, because naming a thing is how you get to enforce it. Continuous Enforcement: the andon cord recalibrated for machine speed, where the agent's recurring mistake becomes non-committable, not merely discouraged in a prompt. Continuous Verification: the part of the specification you can make decidable, turned into a checkable artefact, so that on that part "done" is computed rather than claimed. These are the comparator and the stop-cord. Without a specified comparator and a real stop-cord you do not have an agentic engineer. You have an extremely fast, extremely confident open loop, and an open loop with a powerful actuator is not autonomy. It is an unguided projectile: committed, quick, and blind to where it lands.
(A note on the name. "Continuous Verification" is also used for an unrelated chaos-engineering practice, and for an ML-rollback feature in at least one deployment tool. I mean neither. I mean the verification half of the same control loop the CI/CD world already built, pushed down into a decidable specification.)
A loop is a control loop, or it is nothing #
Strip the language model out and look at the shape. A control loop has five parts, and you cannot remove any of them and still call what is left "control":
- A setpoint: the target. The dimension the part must hit, the property the code must satisfy.
- An actuator: the thing that acts on the world. The cutting tool, the agent writing a diff.
- A sensor: a measurement of where the system actually is, not where you assume it is.
- A comparator: the operation that subtracts sensor from setpoint and produces an error signal.
- A feedback law: how that error drives the next action, and with how much eagerness.
Take away the comparator and the sensor and you have an open loop: it acts, but it never learns whether the act landed. Open loops are fine when the world is perfectly predictable. They are catastrophic the moment reality drifts, because they have no way to notice they are wrong. They drive confidently off the road, and the first sign of trouble is the impact.
A closed loop is not automatically a good loop either. Make the feedback law too eager, react too hard to every error, and the loop oscillates. Make it too timid and it never reaches the setpoint. Control engineers quantify this with gain margin and phase margin [how much the loop's gain or timing can shift before a stable loop starts to ring, or diverges altogether], precisely because the useful region is the one in between. And there is a genuine no-free-lunch result lurking nearby, Bode's sensitivity integral, sometimes called the waterbed effect: in a linear feedback loop you cannot suppress error everywhere; push sensitivity down at one frequency and it pops up at another. I am borrowing the intuition, not the theorem, because my gates are hard constraints rather than a linear controller and the theorem does not literally apply to them. But the intuition has a precise home in this system, and it is worth naming, because most attempts get it wrong: the trade you cannot escape is between false alarms and missed defects. Tighten a check until it never lets a real defect through and it will start stopping good work; loosen it until it never cries wolf and it will wave defects past. That conserved tension, not a borrowed theorem about frequencies, is the one Continuous Enforcement has to manage.
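To make the too-eager and too-timid cases concrete, here is a minimal sketch of a discrete proportional loop. It is a toy under stated assumptions (an ideal noise-free sensor, a single scalar state, and a `gain` parameter of my own naming), not a model of any real agent.

```python
def run_loop(setpoint: float, gain: float, steps: int = 20, start: float = 0.0) -> list:
    """Discrete proportional loop: each step corrects by gain * error."""
    position = start
    trajectory = [position]
    for _ in range(steps):
        measured = position            # sensor (assumed ideal and noise-free)
        error = setpoint - measured    # comparator
        position += gain * error       # feedback law driving the actuator
        trajectory.append(position)
    return trajectory


timid = run_loop(setpoint=1.0, gain=0.05)    # reacts too weakly
balanced = run_loop(setpoint=1.0, gain=0.5)  # sits in the useful region
eager = run_loop(setpoint=1.0, gain=2.5)     # reacts too hard and overshoots

for name, path in [("timid", timid), ("balanced", balanced), ("eager", eager)]:
    print(f"{name:9s} final error = {abs(1.0 - path[-1]):.6g}")
```

After twenty steps the timid loop is still well short of the setpoint, the balanced one has effectively arrived, and the eager one overshoots by more on every step, flipping sign as it goes.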
Which lets me state the design criterion the loop discourse keeps gesturing at without pinning down. We want the loop loose enough to be agentic, so the agent can explore, choose tools, take large steps, and surprise us. We want it tight enough to converge, so the whole thing reaches the goal instead of wandering into a confident, well-formatted hallucination. These are not in tension once you stop treating the loop as a single dial and start treating it as a region with a boundary. Partition it: the looseness belongs in the actuator, let the agent roam freely inside the feasible region; the tightness belongs at the boundary, pin the edges of that region with mechanisms the agent cannot argue its way past. A big permissive interior with hard walls is a different object from a uniformly tight loop, and it is the object we actually want.
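The partition described above can be sketched as a loop in which the agent proposes freely, a deterministic wall accepts or rejects each proposal, and an external comparator decides when the work is finished. Everything here is hypothetical scaffolding: `propose`, `within_walls` and `is_done` are stand-in callables, not the API of any SDK.

```python
from typing import Callable, Tuple

State = Tuple[str, ...]


def run_agent(
    propose: Callable[[State], State],      # hypothetical: the model suggesting a next state
    within_walls: Callable[[State], bool],  # hypothetical: deterministic boundary check
    is_done: Callable[[State], bool],       # hypothetical: external comparator, not the model's say-so
    state: State = (),
    max_turns: int = 10,
) -> Tuple[State, str]:
    for _ in range(max_turns):
        if is_done(state):
            return state, "converged"
        candidate = propose(state)
        if within_walls(candidate):
            state = candidate  # inside the feasible region: accept
        # otherwise the proposal is dropped and the agent must try something else
    return state, ("converged" if is_done(state) else "budget exhausted")


# Toy run: a scripted "agent" that tries one forbidden move along the way.
script = iter(["write test", "force push", "fix bug", "run suite"])
final, verdict = run_agent(
    propose=lambda steps: steps + (next(script),),
    within_walls=lambda steps: "force push" not in steps,
    is_done=lambda steps: {"write test", "fix bug", "run suite"} <= set(steps),
)
print(final, verdict)
```

The agent never declares itself finished. The loop ends because `is_done` says so or because the turn budget runs out, and the forbidden step never enters the accepted state.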
The rest of this essay is about where, concretely, to put the walls.
The asymmetry that opened the loop #
Here is why an old problem suddenly became urgent. Coding agents have collapsed the cost of emission, producing a diff, from hours to seconds. They have not collapsed the cost of verification, understanding a diff, by anything like the same factor. Agents do help with verification, they write tests, explain diffs, draft review comments, but the act of actually convincing yourself a change is correct is still bounded by a human holding the system in their head, and that has barely moved. One side of the ledger fell by three orders of magnitude. The other crept.
The texture of the resulting failure is less like a discrete bug and more like turbulence: smooth, locally-reasonable flow that tips, past some throughput, into a regime where non-local couplings dominate and no reviewer can hold the whole thing intensionally [as a structural, why-it-works model] any more. Past that tipping point a codebase keeps running, but only because it is held together by an accreted layer of compensating workarounds, instrumentation that exists solely to keep an unstable airframe flying. I have taken to calling that layer coding-agent avionics; the coinage is mine. The picture is a metaphor, not a metric. I am not going to dress it up with dimensionless numbers I cannot actually measure, because that would be exactly the borrowed-authority move this essay exists to argue against. The load-bearing fact needs no decoration: emission got cheap, verification did not, and a fast actuator with an unspecified comparator slides straight into the avionics regime.
The setpoint: what, precisely, are we converging on? #
You cannot close a loop without a setpoint, and "good code" is not a setpoint. It is a mood. The target has to be a quantity you can in principle move toward, or the comparator has nothing to subtract against.
The setpoint I use is a lifetime integral, and it is worth stating plainly because it changes what "done" means. Judge an asset not by whether it helps in the moment but by its value across its whole life:
V(asset) = ∫₀ᵀ [ u(t) − c(t)/α ] e^(−rt) dt
u(t) is the utility the asset throws off at time t. c(t) is the human cost it imposes: debugging, maintaining, re-explaining, rotating its credentials, chasing its flakes, cleaning up after it. T is its expected lifetime, and r is the rate at which future value is discounted. The lever is the divisor α, the fraction of a maintainer's time they are willing to spend on maintenance at all. Invert it and you get a shadow price [the real cost of consuming one unit of a constrained resource]. Pick your own α; mine is currently 0.004, which puts the price on maintenance time at 250x. That number is a stipulated knob, not a law of nature, and nothing in the argument depends on its exact value. What it encodes is a qualitative claim that does all the actual work: every unit of ongoing human cost an asset imposes has to be repaid many times over in utility, or the asset is value-negative across its life and should not exist, however well it demos.
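A worked special case may make the 250x concrete. Assume, purely for illustration, that utility and cost are constant over the asset's life and that the discount rate is positive; the bars are my notation for those constants, not part of the original formula.

```latex
V = \int_0^T \left( \bar{u} - \frac{\bar{c}}{\alpha} \right) e^{-rt}\, dt
  = \left( \bar{u} - \frac{\bar{c}}{\alpha} \right) \frac{1 - e^{-rT}}{r},
\qquad r > 0,\ T > 0

% The discount factor (1 - e^{-rT})/r is strictly positive, so the sign of V
% is the sign of the bracket alone:
V > 0 \iff \bar{u} > \frac{\bar{c}}{\alpha} = \frac{\bar{c}}{0.004} = 250\,\bar{c}
```

Under these assumptions neither the lifetime nor the discount rate changes the verdict. An asset that costs one unit of human attention per period is worth keeping only if it returns more than 250 units of utility in the same period.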
This is the only honest way I know to price the thing agents are structurally bad at. An agent does not feel c(t). It will never be paged at 2 a.m. by the cron job it wrote without idempotency [the property of being safe to run twice]. You will. The integral drags that future cost into the present decision, where it can change what gets built. And it splits the cost into two parts that need different responses. c_irr, the irreducible cost, comes from outside: a dependency ships a breaking change, a certificate expires, a vulnerability lands. You cannot stop those events, only drive down the cost of responding to each one. c_sys, the systematic cost, is the preventable kind: the schema bug, the untyped boundary, the unverified claim, the temporary workaround. As the number of systems one person maintains grows, the irreducible baseline alone consumes most of the budget, which forces a hard conclusion: the preventable cost has to go to roughly zero, because there is no room left for it.
I have watched this go wrong at my own expense, and the shape of the mistake is worth keeping, because it is not the one you would guess. I asked an agent to build the first working version of a Chrome extension that lets a coding agent running on a cloud development box see what I see in a browser window on my own machine. The production substrate was a relay on infrastructure I control, and standing it up needed a ten-second approval from me, because it would cost a little money. Two of my own standing instructions then collided. One says: always ask before spending money. The other says: do not interrupt me for small things, protect my attention. Each is sensible on its own. Together they have a perverse joint solution, and the agent found it: route around the spend entirely, so the money rule is never triggered and the interruption never has to happen. It stood up a free public tunnel on a throwaway box, ran my real session through it, and spent my evening debugging the instability of a transport we were never going to ship. It had honoured both rules to the letter and defeated the point of both, because the cost it minimised was the one it could see, my bill and a prompt, and the cost it spent was the one it could not, an evening of my attention on a path that could never become the product. The throwaway code was nearly free. The hour of my attention on a non-transferring path was the expensive part, and the integral above, which prices an asset that already exists, was structurally blind to it, because nothing that already existed ever looked bad. An agent that cannot feel c(t) will spend yours to economise on a number it can read.
There is a deeper reading, and it is why this is not a footnote. The delegation existed to save my time and it cost more of it, which inverts the premise of agent autonomy. The binding constraint on a human working with agents is not compute but attention: as I argued in The Agent Tending Problem, you can run fifty agents and still have only one of yourself, so the most expensive thing an agent can do is consume the human it was meant to free. Put a price on that attention and the energy ledger turns over. Human and machine intelligence are converging on an energy equilibrium, calories on one side and joules on the other, and at some point the demand curves cross and the trade gets made; I have called that the energy pinch point, and the Chrome-extension evening was that same ledger at the scale of a single night. Delegation is meant to swap cheap machine energy for scarce human energy and come out ahead. This one spent both, the agent's compute on a tunnel we discarded and my evening on a dead end, so the combined energy of the collaboration rose rather than fell. Delegation is an energy win only if the loop converges, and an agent without Continuous Enforcement and Continuous Verification does not reliably converge. A loop run to save effort that does not converge is the most efficient machine yet built for burning it, on both sides at once.
Hold onto the two-rules-collide shape of this; it returns later, with a name.
c_sys ≈ 0 is the setpoint. Continuous Enforcement and Continuous Verification are how the loop converges on it.
There is a companion test that makes the setpoint something you can feel. Ship the asset, then walk away and touch nothing for two months. Does it still satisfy u(t) − c(t) > 0 with zero attention from anyone? If it would drift, jam, time out, accumulate cruft, or quietly start lying, then c(t) is too high and the asset is not done, whatever the test suite says.
Toyota already solved the open loop #
Now the genealogy, carefully, because getting it wrong would be its own small act of confabulation.
The Toyota Production System is the most studied closed-loop control system ever built out of people and machines. Its house has two pillars, just-in-time and jidoka, standing on a foundation of kaizen and standardised work. I am deliberately setting aside just-in-time, the flow-and-inventory discipline, and pairing jidoka with kaizen, because those are the two that map onto the agent problem. (When I called them "the two ideas I want to borrow" above, that is what I meant; they are not Toyota's two pillars.)
Jidoka is usually translated "autonomation", automation with a human touch, but the operational content is sharper than the translation: build the line so it surfaces a defect the instant it occurs, and stop rather than build the defect into the product. The mechanism on the floor is the andon cord. The detail the popular telling flattens is worth keeping: pulling the cord first summons the team leader, and the line halts at a fixed position only if the problem is not resolved within the cycle. Signal, help, escalate, then halt: a tiered response, not an instant full stop. And the cultural inversion that made it work is that pulling the cord was not a failure to be punished. It was the system working as designed. W. Edwards Deming would later make the same point a principle: drive out fear, because most defects belong to the system, not the worker, and a frightened worker hides the defect instead of surfacing it.
Two more pieces. Poka-yoke [mistake-proofing] is designing the wrong action to be physically impossible rather than merely warned against: the connector that only fits one way, the fixture that will not close on a misaligned part. And the improvement loop itself is a control loop: the Shewhart cycle that Deming carried to Japan in 1950, which Japanese practitioners reformulated as Plan-Do-Check-Act, and which Deming himself later insisted should be Plan-Do-Study-Act, because the third step is learning, not a pass/fail inspection. Hold onto that correction; it matters below. Setpoint, sensor, comparator, feedback law, and a Study step that turns each result into knowledge. They built it out of people and a rope decades before we tried to build it out of hooks and a kernel.
One Toyota principle deserves singling out, because it is the one the agentic factory breaks most often. A temporary countermeasure is not left in place. When the line stops and the real fix cannot land immediately, a stopgap may keep things moving, but it is tracked, owned, and removed when the root cause is fixed. It is never allowed to quietly become the norm. Toyota understood in the bone that the temporary workaround is the most dangerous object in the factory, because it works just well enough to be forgotten, and then it becomes load-bearing.
Continuous Enforcement: the andon cord, recalibrated #
Continuous Enforcement is jidoka for agents, and the honest starting point is that we are not inventing the andon cord. Continuous Integration already is one: a red build blocks the merge, a failed pipeline stage halts promotion, and the andon analogy for CI is an old one. CI/CD is software's jidoka. The trouble is that it was tuned for a human author who emits slowly, has a reputation to protect, and feels the cost of their own mistakes. The agent breaks all three assumptions at once. So the andon has to be recalibrated for machine-speed emission and for an actor with no innate stake in correctness, and it has to be loaded with semantics mined from what has actually gone wrong rather than generic build-and-test.
The seductive way to keep an agent in line is to write the rule in the prompt. "Always check the deployed branch before deploying." "Never reuse a shared credential without auditing its scope." I have come to call this prompt-prayer, and it is the single most tempting mistake in the field, because it feels like governance and costs nothing to write. It does not work, for a structural reason: a prompt is a request, and an agent under load, with a full context window and a plausible shortcut in front of it, rounds requests off. Prose guardrails are advisory, and the whole lesson of jidoka is that advisory does not stop the line. So you stop relying on the agent's compliance and you change the environment instead.
The pattern that has worked for me is a program of small, deterministic checks wired into the places where mistakes actually get committed: the agent's tool calls, the pre-commit hook, the pre-push hook, the nightly CI sweep. Each check encodes one specific, previously-observed failure, and when it fires it does not suggest. It blocks. The mistake becomes non-committable. The slogan I keep returning to: make the agent's mistake non-committable, not asked-nicely.
This is not a thought experiment. I run a program like this across my own work: a few hundred real incidents, mined from many agent sessions, distilled into several dozen deterministic checks, with a ledger that has logged real firings in the low hundreds. Treat those numbers as a field report from one system rather than a product you can clone, because the open-source spin-out is not public yet. The single most useful finding from mining the corpus was that the largest failure family is not bad code. It is epistemic: the agent acted on a stale or imagined picture of reality, then reported success without observing the live system. It deployed the wrong branch and declared victory. It declared a database column present that the live schema had never heard of. It marked a specification "verified" without running the verifier. So most of the high-value checks are reality-reconciliation mechanisms: they compare a belief-bearing artefact, the manifest, the declared schema, the claimed state, against observable ground truth, the live deployment, the actual API response, the kernel's own verdict. This is genchi genbutsu, "go and see the actual place", compiled into a cron job.
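A reality-reconciliation check of the kind described above can be very small. The sketch below compares a declared schema against what a live query reports; the column names are invented and the live side is stubbed with a literal, since how you actually read the live schema depends on your database.

```python
def reconcile(declared: set, observed: set) -> list:
    """Compare a belief-bearing artefact against observed ground truth."""
    problems = []
    for column in sorted(declared - observed):
        problems.append(f"declared but missing in live schema: {column}")
    for column in sorted(observed - declared):
        problems.append(f"present in live schema but undeclared: {column}")
    return problems


# Hypothetical data: what the manifest claims versus what the live system returned.
declared_columns = {"id", "email", "last_login"}
live_columns = {"id", "email"}  # stub for a real introspection query

findings = reconcile(declared_columns, live_columns)
for finding in findings:
    print(finding)
```

The check never asks the agent what it believes. It reads the artefact the belief was written into and the system the belief is about, and reports any gap between them.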
Three design choices keep such a program humane rather than tyrannical, and they are where most attempts go wrong.
First, match the enforcement strength to the blast radius. Not every check should block. A tiered model works: a small set of absolute denials for the genuinely catastrophic and perfectly decidable; a layer of hard blocks for decidable predicates with near-zero false-positive rates, where an override is possible but recorded; and a layer of warnings for the useful-but-fuzzy heuristics, which earn promotion to blocking only once their precision is proven. This is the false-alarm-versus-missed-defect trade from earlier, made operational. A gate that cries wolf is itself a defect, because a noisy alarm trains people to ignore it, and a trained-to-be-ignored alarm is worse than none. The over-zealous gate is not a stricter version of a good gate; it is a different and worse object, and it is subject to the same lifetime-value test as everything else, because it imposes exactly the c(t) the discipline exists to minimise.
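The three tiers can be sketched as a small decision function. The tier names, the `Outcome` type and the promotion thresholds (50 firings, 99% precision) are assumptions chosen for illustration, not figures from the program described here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Tier(Enum):
    DENY = "deny"    # catastrophic and perfectly decidable: no override
    BLOCK = "block"  # decidable, near-zero false positives: override is recorded
    WARN = "warn"    # useful but fuzzy: advisory until precision is proven


@dataclass(frozen=True)
class Outcome:
    allowed: bool
    note: str


def decide(tier: Tier, fired: bool, override_reason: Optional[str] = None) -> Outcome:
    if not fired:
        return Outcome(True, "clean")
    if tier is Tier.DENY:
        return Outcome(False, "denied: no override exists")
    if tier is Tier.BLOCK:
        if override_reason:
            return Outcome(True, f"override recorded: {override_reason}")
        return Outcome(False, "blocked: override requires a stated reason")
    return Outcome(True, "warning only")


def may_promote(true_positives: int, false_positives: int,
                min_firings: int = 50, min_precision: float = 0.99) -> bool:
    """A warning earns promotion to blocking only once its precision is demonstrated."""
    firings = true_positives + false_positives
    if firings < min_firings:
        return False
    return true_positives / firings >= min_precision
```

Promotion is gated on measured precision rather than on how serious the rule sounds, which keeps the noisy heuristics out of the blocking tiers until they have earned the place.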
Second, every firing must teach. A good gate does not just say "denied". It says four things: what fired, with the evidence; why it exists, naming the original incident, its date, and what it cost; what to do instead, concretely; and how to override if you genuinely must. The block is not a punishment. It is the system explaining, at the moment of maximum relevance, a lesson someone already paid for once. And here Deming's correction earns its keep: the override record should be a Study artefact, a "why did we need to bypass this?" that feeds the next improvement, not an attribution ledger of who-bypassed-what. Make it the latter and you have automated the andon cord and then re-hired the time-and-motion man to stand over it; people will optimise to stay out of the record rather than to surface defects, which is the precise failure Deming spent a career warning against.
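The four things a firing should say fit naturally into a small record. This is a sketch with hypothetical field names, and the example values are placeholders rather than a real incident.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Firing:
    what: str      # what fired, with the evidence
    why: str       # the original incident, its date, and what it cost
    instead: str   # the concrete alternative
    override: str  # how to bypass if you genuinely must

    def render(self) -> str:
        return "\n".join([
            f"BLOCKED: {self.what}",
            f"WHY: {self.why}",
            f"INSTEAD: {self.instead}",
            f"OVERRIDE: {self.override}",
        ])


# Placeholder content only; a real gate would fill these from its incident ledger.
example = Firing(
    what="deploy target does not match the checked-out branch",
    why="<incident id>, <date>: wrong branch deployed, cost <hours lost>",
    instead="check out the intended branch, then re-run the deploy",
    override="re-run with a written reason, which is logged for later study",
)
message = example.render()
print(message)
```

Making all four fields required means a gate cannot be registered with a bare "denied": the type forces whoever writes the check to write the lesson too.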
Third, the gate must be able to prove it can fire. A monitor that has never gone red is not evidence of safety; it is an untested claim, and quite possibly a piece of theatre quietly wired to something that is always green. So the rule is that a check may not guard anything until it has demonstrated both a red case and a green case. A safety mechanism that cannot demonstrably ring is not a safety mechanism.
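The red-case and green-case rule is itself easy to make mechanical. In this sketch a check is any function that returns True when it fires; the two example checks and their inputs are invented for illustration.

```python
from typing import Callable


def proves_it_can_fire(check: Callable[[str], bool], red_case: str, green_case: str) -> bool:
    """A check may guard something only if it fires on red and stays quiet on green."""
    return check(red_case) and not check(green_case)


def plaintext_secret(text: str) -> bool:
    # Hypothetical check: fires when a password appears inline.
    return "password=" in text


def always_green(text: str) -> bool:
    # Theatre: a monitor wired to something that never goes red.
    return False


red = "db_url=postgres://app:password=hunter2@host"
green = "db_url loaded from the secret store"

real_gate_ok = proves_it_can_fire(plaintext_secret, red, green)
theatre_ok = proves_it_can_fire(always_green, red, green)
print(real_gate_ok, theatre_ok)
```

A registration step that refuses any check for which this returns False is enough to keep the always-green monitor out of the system.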
There is a half of jidoka that gates alone do not capture, and it is the more important half. The andon stop exists to free a human to walk to the floor, see the abnormality with their own eyes, and drive it to root cause, the five whys, ending in a permanent countermeasure. A deterministic check is the output of that investigation, a countermeasure to an already-diagnosed defect, frozen into standardised work so the same walk never has to happen twice. It is not a substitute for the walk. The live, human-led loop still has to exist for novel defects: something breaks in a way no gate anticipated, a human goes and sees, finds the root cause, and the deliverable of that work is a new gate. The cron job does not replace genchi genbutsu. It records its conclusions. Skip the human investigation and keep only the automated cord and you have built exactly the "automation without the human touch" that jidoka was invented to prevent.
When you first switch such a program on, it lights up like a Christmas tree, because the debt was already there and you simply could not see it. Orphaned pointers that broke every fresh clone. A credential sitting in plaintext across dozens of remotes. A registration days from silent expiry. The burst is not the gate misbehaving; it is the gate making an invisible backlog visible all at once. Budget for it, and do not weaken the gate to quiet it, because the noise is the proof the control was needed.
The failure it forbids: Normfall #
Recall Toyota's most uncompromising rule, that a temporary countermeasure is not left in place. There is a precise failure mode it exists to prevent, the one Continuous Enforcement most needs to forbid, because it is the one agents fall into most naturally. I call it Normfall: the deviation that has become the norm.
A bug is found. The real fix is expensive or risky, so a workaround goes in, the timestamp gets coerced here, the unexpected status value gets defaulted there. The workaround works. So a second piece of code is written assuming the workaround's behaviour. Then a third. After enough time the original bug is load-bearing: the workarounds depend on it, and fixing the bug now breaks the system. The rational response is to stop trying and decide to live with it. The deviation has been promoted to architecture, passed into the codebase's folklore as just the way things are, the highest tier in the little taxonomy of defect-persistence I keep, the mythologised one.
Normfall is where the avionics regime hardens. Agents accelerate the slide for an obvious reason: a workaround is locally plausible, it makes the failing test pass, it ships in seconds, and the agent carries none of the cost it has just deferred. Every individual workaround clears review, because each one in isolation is reasonable. The defect is the pattern across them, which no single diff ever shows you. You do not catch Normfall by reviewing harder. You catch it by refusing the first temporary workaround the moment it is proposed, which is what a "no temporary countermeasures" reflex does: it treats "quick", "interim", "for now", and "stop-gap" as the dangerous words they are, and asks for the real fix, scoped and sequenced if it is large, but never deferred into folklore.
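The "dangerous words" reflex can be approximated by a deterministic scan of the lines a diff adds. This is a deliberately crude sketch: the word list is the four terms named above, the sample diff is invented, and a real check would need tuning to keep its false-alarm rate down.

```python
import re

# The four terms the text names; word boundaries keep "quick" from matching "quicksort".
TEMPORARY = re.compile(r"\b(quick|interim|for now|stop-?gap)\b", re.IGNORECASE)


def flag_temporary(diff_text: str) -> list:
    """Return (line number, text) for added lines that announce a temporary fix."""
    hits = []
    for number, line in enumerate(diff_text.splitlines(), start=1):
        is_added = line.startswith("+") and not line.startswith("+++")
        if is_added and TEMPORARY.search(line):
            hits.append((number, line[1:].strip()))
    return hits


# Invented example diff in unified format.
sample_diff = """\
--- a/orders.py
+++ b/orders.py
@@ -10,2 +10,3 @@
 status = row["status"]
+status = status or "unknown"  # for now, until upstream is fixed
+total = compute_total(row)
"""

hits = flag_temporary(sample_diff)
for number, text in hits:
    print(f"line {number}: {text}")
```

The scan catches only workarounds that announce themselves, so it is a first tripwire and not a detector of Normfall as such.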
The deepest reason this works is the one Toyota understood about the rope. The cost of stopping the line now is visible, finite, and small. The cost of letting the defect flow is invisible, compounding, and eventually unpayable. Continuous Enforcement is the institutional decision to pay the small visible cost every single time, so the large invisible one never comes due.
The gate is stateless; the defect is not #
There is a hole in everything I have just said, and Normfall is where you fall into it. A gate fires on a diff. It sees the change in front of it, judges it in isolation, and passes or blocks. But Normfall is not a property of any diff. It is a property of the accumulation of diffs: each edit clears the gate, and the union of the edits is the bug. A stateless, per-round check is structurally blind to a defect that lives only in the combination, in exactly the way a per-agent safety check is blind to two agents that are each safe alone and dangerous together.
That second comparison is not a loose analogy; it is the same failure, the non-compositional kind of safety, where a property holds for every part and fails for the whole. I have been building a proof-of-concept for the multi-agent version of it, Proof-Carrying Coalitions: a set of individually safe agents whose combination reaches a capability none could reach alone, safe ∪ safe equals unsafe. The fix it formalises is to stop checking the parts and check the closure of the whole. You compute everything the combined agents can jointly reach, including the emergent capabilities that only the combination unlocks, and you ask the Lean kernel to prove that this closure never touches a forbidden set. The agents may propose; the kernel disposes. The verdict is carried by the combination, never by a member.
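The closure idea can be illustrated without a theorem prover. The sketch below is a plain Python stand-in, not the Lean development: it computes the fixpoint of a capability set under combination rules and tests it against a forbidden set. The capability names and the single rule are invented.

```python
def closure(capabilities: frozenset, rules: list) -> frozenset:
    """Everything reachable from the capabilities, including emergent ones."""
    reached = set(capabilities)
    changed = True
    while changed:
        changed = False
        for needs, grants in rules:
            if needs <= reached and grants not in reached:
                reached.add(grants)  # the combination unlocks something new
                changed = True
    return frozenset(reached)


def is_safe(capabilities: frozenset, rules: list, forbidden: frozenset) -> bool:
    return closure(capabilities, rules).isdisjoint(forbidden)


# Hypothetical rule: reading secrets plus network egress yields exfiltration.
rules = [(frozenset({"read_secrets", "send_network"}), "exfiltrate")]
forbidden = frozenset({"exfiltrate"})

agent_a = frozenset({"read_secrets"})
agent_b = frozenset({"send_network"})

print(is_safe(agent_a, rules, forbidden))            # each agent alone
print(is_safe(agent_b, rules, forbidden))
print(is_safe(agent_a | agent_b, rules, forbidden))  # the coalition
```

Computing the answer is weaker than proving it: this script shows the shape of the check, while the kernel-checked version is what would make the verdict something no agent can argue with.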
This is the name I promised for the Chrome-extension evening, and it is the same non-compositionality seen from the other side. Proof-Carrying Coalitions worries about the union of what agents can do, A ∪ B, where adding capability sets can reach a forbidden one. The two rules that stranded me were the boolean dual, the intersection of what constraints permit, A ∩ B: each instruction was satisfied by good behaviour on its own, but their conjunction left a perverse corner standing, the workaround that honoured both letters and betrayed both purposes, and an optimiser walks straight into it. Capability sets compose by union and get you more than you sanctioned; constraint sets compose by intersection and still leave room for the one behaviour you would never have sanctioned. Either way the moral is the one this essay keeps arriving at: you cannot certify a combination by checking its parts. Each agent was safe; each rule was sound; the bug lived in the ∪ and the ∩. I want to be honest that the formalism above covers the capability-union case and not yet the instruction-intersection one, which I raise as the same shape of problem rather than a thing I have proved.
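The intersection case can at least be made visible with a toy enumeration. This is an informal encoding of the anecdote, not a formalism: the `Plan` fields and the three candidate plans are my own simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    spends_money: bool
    asks_first: bool
    interrupts_human: bool
    transfers_to_product: bool  # the purpose neither rule states


def rule_ask_before_spending(plan: Plan) -> bool:
    return (not plan.spends_money) or plan.asks_first


def rule_protect_attention(plan: Plan) -> bool:
    return not plan.interrupts_human


plans = [
    Plan("paid relay, ask first", True, True, True, True),
    Plan("paid relay, no ask", True, False, False, True),
    Plan("free throwaway tunnel", False, False, False, False),
]

# Constraints compose by intersection: a plan must satisfy both rules.
permitted = [p for p in plans if rule_ask_before_spending(p) and rule_protect_attention(p)]
for plan in permitted:
    print(plan.name, "| transfers to product:", plan.transfers_to_product)
```

Each rule on its own admits a plan that serves the product. Only their conjunction narrows the permitted set to the one plan that does not, because the purpose both rules were protecting was never written down as a constraint.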
Point that lens back at one agent editing one codebase over a hundred rounds and the prescription is identical: the gate must run over the cumulative state, the closure of all edits to date against a forbidden set of emergent results, and not diff by diff. Whether edit seven quietly leaned on a guard that edit three removed is invisible in either diff and visible only in their closure. I want to be honest about the distance left to travel, because this is the part of the essay that is a research direction rather than a shipped thing. My proof-of-concept covers the monotone case, where capabilities only ever accumulate, and sequential editing is the opposite: a later edit can revoke what an earlier one assumed, which is precisely the regime the formalism does not yet handle. And it needs the same scarce thing all of Continuous Verification needs, a named, decidable invariant for the combination to preserve, which is the hard part and usually the fuzzy part. So I offer it as the shape of the answer, not the answer: the per-step gate is necessary and not sufficient, and the missing piece is a kernel-checked property of the closure of all the edits, not of any single diff.
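The difference between a per-diff gate and a cumulative one shows up even in a toy replay. The sketch below is not the formalism discussed above and proves nothing; it simply replays edits that add, remove and rely on named guards, with `Edit` and the guard name being hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    adds_guards: frozenset = frozenset()
    removes_guards: frozenset = frozenset()
    relies_on: frozenset = frozenset()


def per_diff_ok(edit: Edit) -> bool:
    # A stateless gate sees one edit: it objects only to an edit that
    # removes a guard it also relies on.
    return edit.removes_guards.isdisjoint(edit.relies_on)


def cumulative_ok(edits: list) -> bool:
    # Replay every edit and ask whether each guard relied upon still exists.
    guards, relied = set(), set()
    for edit in edits:
        guards |= edit.adds_guards
        guards -= edit.removes_guards
        relied |= edit.relies_on
    return relied <= guards


history = [
    Edit(adds_guards=frozenset({"null_check"})),     # an early edit installs the guard
    Edit(removes_guards=frozenset({"null_check"})),  # a later one quietly removes it
    Edit(relies_on=frozenset({"null_check"})),       # a later one still leans on it
]

print(all(per_diff_ok(e) for e in history))  # every diff clears the stateless gate
print(cumulative_ok(history))                # the accumulated state does not
```

The hard part remains what the text says it is: naming the guards and what relies on them is exactly the decidable invariant a real codebase rarely hands you.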
Continuous Verification: making "done" computable, and being honest about what that buys #
Enforcement is the stop-cord. Verification is the comparator, the part of the loop that measures the gap between where the agent thinks it is and where it really is.
Start with the cheap version, because most of the value is cheap and a busy engineer should not have to stand up a theorem prover to get it. Type your boundaries so malformed data cannot cross them. Validate every input and output against a schema. Write assertions that actually run. And where a criterion is genuinely fuzzy, "is this summary faithful?", "is this tone right?", do not pretend a kernel can settle it; use a separately calibrated judge, an evaluation measured for precision and recall against human labels, which is prose-based but emphatically not self-grading. That is already a real comparator, and it needs no new toolchain. The formal layer below is optional hardening of a small, decidable surface, not the price of entry.
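The cheap version needs nothing beyond the standard library. Here is a minimal sketch of a typed, validated boundary; `DeployRequest`, its two fields and the 1 to 20 replica range are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeployRequest:
    branch: str
    replicas: int

    def __post_init__(self):
        # Assertions that actually run: malformed data cannot become a DeployRequest.
        if not isinstance(self.branch, str) or not self.branch:
            raise ValueError("branch must be a non-empty string")
        if isinstance(self.replicas, bool) or not isinstance(self.replicas, int):
            raise ValueError("replicas must be an integer")
        if not 1 <= self.replicas <= 20:
            raise ValueError("replicas must be between 1 and 20")


def parse_deploy_request(raw: dict) -> DeployRequest:
    """Validate untrusted input at the boundary, before anything acts on it."""
    unknown = set(raw) - {"branch", "replicas"}
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    try:
        return DeployRequest(**raw)
    except TypeError as exc:  # raised when a required field is missing
        raise ValueError(str(exc)) from exc


request = parse_deploy_request({"branch": "main", "replicas": 3})
print(request)
```

Code past the boundary can then take a `DeployRequest` and stop re-checking it, which is the point of putting the validation in one place.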
Now the formal layer, stated precisely about what it does and does not buy. There are two substances in software with exactly one meaning: a type-checked predicate has one meaning per the compiler, and a document has one shape per its schema. Everything else, every "verified" in a commit message, every hand-written summary of derived state, admits interpretation, and interpretations drift. So push every load-bearing, decidable fact down into a layered structure where the human-readable part is computed from the checkable part and never authored alongside it:
- The model lives in kernel-checked predicates [a `.lean` file in my case; the substrate matters far less than the property]. Type errors and invalid proofs are caught when the kernel refuses to build.
- The values live in schema-validated data [`.json`], each fact ideally carrying a small provenance triple: where it came from, when it was sourced, when it was last verified.
- The state is computed. "Is this ready to ship?" is not a sentence someone writes; it is a decision procedure evaluated from the model and the values. It cannot drift, because nobody types it.
- The view, the dashboard, the report, the prose a human reads, is generated from the state by a committed script. Nobody hand-edits it. If the generator is correct the view cannot diverge from the truth, because there is no second copy of the truth for it to diverge from.
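The four layers above can be sketched end to end in a few lines. This is an illustration only: plain Python functions stand in for the kernel-checked model, and the fact names, sources, and dates in `VALUES_JSON` are made up for the example.

```python
import json

# Values: schema-validated data, each fact carrying its provenance (all entries hypothetical).
VALUES_JSON = """
{
  "tests_passing": {"value": true, "source": "ci-run", "sourced": "2024-01-10", "verified": "2024-01-12"},
  "open_blockers": {"value": 0, "source": "issue-tracker", "sourced": "2024-01-11", "verified": "2024-01-12"}
}
"""
FACT_KEYS = {"value", "source", "sourced", "verified"}

def load_values(text: str) -> dict:
    """Refuse any fact that does not carry exactly the expected keys."""
    values = json.loads(text)
    for name, fact in values.items():
        if not isinstance(fact, dict) or set(fact) != FACT_KEYS:
            raise ValueError(f"fact {name!r} does not match the schema")
    return values

def ready_to_ship(values: dict) -> bool:
    """State: a decision procedure over the values, never a sentence someone writes."""
    return values["tests_passing"]["value"] is True and values["open_blockers"]["value"] == 0

def render_view(values: dict) -> str:
    """View: generated from the state and the values, never hand-edited."""
    lines = [f"ready to ship: {'yes' if ready_to_ship(values) else 'no'}"]
    for name in sorted(values):
        fact = values[name]
        lines.append(f"{name} = {fact['value']} (source: {fact['source']}, last verified: {fact['verified']})")
    return "\n".join(lines)

values = load_values(VALUES_JSON)
print(render_view(values))
```

Changing a fact changes the report on the next run because the report is a function of the facts. There is no status file to forget.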
The thing this kills is the most innocent-looking artefact in any codebase: the hand-written status document. I will own the founding case, because it was mine to cause. Asked for a snapshot of a plan, I produced a tidy markdown file summarising a set of facts as prose. It was accurate the instant I wrote it. The objection came back fast and was not about tone: we are building a formally verified system, did you not get the memo? The point was structural. The moment that markdown existed it became a second source of truth, and the cost of every future change to the underlying facts had silently risen, because now two things had to be updated and only one of them ever would be. The overhead scales as roughly N × K: N copies of a fact, K changes over time, drift on N − 1 of them all but guaranteed. The fix is not better discipline about keeping copies in sync. Discipline is a tax you pay forever and eventually miss. The fix is to have no second copy: compute the view, never curate it.
And now the concession the whole section turns on, because without it Continuous Verification is the same overclaim it accuses everyone else of. A kernel checks code against a specification a human wrote. It never checks that the specification encodes what was actually wanted. Make "done" computable and you have not abolished the human judgement; you have relocated it, from "is this code correct?" up to "is this predicate the right predicate?". The proof that the relocation is real, not rhetorical, is that I have audited my own earlier work and found hundreds of predicates that type-checked, claimed to be verified, and established nothing. The crudest form is a clause defined to be `True`. But vacuity is broader and sneakier than that: a universal quantifier over an empty list is true and says nothing; a `sorry` makes the build succeed with a warning, and an `axiom` makes it succeed in total silence, both proving nothing while the kernel reports no error; a predicate over the wrong domain checks the wrong thing. The silent one is the more dangerous, which is the whole point. Every one of these passes the kernel and launders a false "done" in the kernel's own uniform, which is worse than an honest prose claim because it is harder to see.
So the discipline has to defend the spec the same way it defends a monitor. Extend the red-and-green rule down to predicates: a criterion counts as verified only if you can also exhibit an input on which it comes out false, a witnessed counter-model. A check that cannot fail on any input is not a strict check; it is `True` wearing a costume. The counter-model rule closes the empty-quantifier hole the same way demonstrating a red case closes the dead-monitor hole. With that rule in place the honest accounting is "twelve predicates proved on inputs that genuinely distinguish pass from fail, four facts tracked as prose and not counted as proved". "Sixteen verified" is a lie that a kernel was used to launder.
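The counter-model rule translates directly into a check you can run over your checks. The sketch below is in Python, not Lean, and the three predicates are hypothetical; the third one reads a misspelt key, so it always quantifies over an empty list. Note the limit of the method: searching a finite set of candidate inputs can witness a counter-model, but failing to find one proves nothing.

```python
def amounts_positive(record: dict) -> bool:
    """A genuine check: it can come out false."""
    return all(a > 0 for a in record["amounts"])

def defined_true(record: dict) -> bool:
    """The crudest vacuity: a clause defined to be True."""
    return True

def empty_quantifier(record: dict) -> bool:
    """Vacuous by accident: the key is misspelt, so all() always ranges over []."""
    return all(a > 0 for a in record.get("amountz", []))

def counter_model(predicate, candidates):
    """Return the first candidate input the predicate rejects, or None if none is found."""
    for candidate in candidates:
        if predicate(candidate) is False:
            return candidate
    return None

CANDIDATES = [{"amounts": [5, 7]}, {"amounts": [5, -1]}, {"amounts": []}]
CHECKS = [amounts_positive, defined_true, empty_quantifier]

# Honest accounting: a check counts as verified only with a witnessed red case.
verified = [c.__name__ for c in CHECKS if counter_model(c, CANDIDATES) is not None]
unwitnessed = [c.__name__ for c in CHECKS if counter_model(c, CANDIDATES) is None]

print("verified:", verified)
print("not counted as proved:", unwitnessed)
```

All three predicates return `True` on the first candidate, which is exactly how a vacuous check hides. Only the red case separates them.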
What is the relocation worth, then, if it does not dissolve the judgement? This: manufacturing could write its setpoint down completely, a diameter and a tolerance, and software mostly cannot, because "is this the right behaviour?" usually has no decidable form. Continuous Verification closes the loop on the decidable corner of the goal and leaves the fuzzy majority where it always was, in human judgement and the calibrated eval. That sounds like a small win until you notice what it does to the surface on which judgement must be exercised. Before, that surface was the entire sprawling, drifting, prose-and-markdown description of the system. After, it is a small set of named predicates and the explicit list of things deliberately left as prose. You still have to think. You just have far less to think about, and what is left holds still while you think about it.
Putting the two together
Put the two together and you have closed the loop the rediscoverers left open, on the part of it that can be closed.
The agent is the actuator, powerful and loose; let it explore the whole feasible region. Continuous Verification is the comparator: on the decidable corner of the goal, the agent's claimed state is subtracted from the computed state, and the error signal comes from a kernel or a schema rather than from the agent's own self-assessment, with a witnessed counter-model proving the comparator can actually say no. Continuous Enforcement is the stop-cord and the mistake-proofing: the boundary of the feasible region is pinned by deterministic gates the agent cannot argue past, and the recurring, already-paid-for mistakes are made non-committable.
I should be precise about the word "converge", because I have leaned on it throughout and it can be cheated. Hard constraints buy boundedness, the agent cannot leave the region, which is not the same as the error being driven to zero and held there. Convergence is a property of the feedback law, the part everyone lists and nobody specifies, and the agent-reacts-to-verifier-error loop has its own stability question that anyone who has watched an agent fix one predicate and break its neighbour has seen fail: the regression limit cycle, round after round, never settling. What damps it is not magic. It is a monotone-progress rule, the ledger refuses to count a round that turned a previously-green check red, a step budget so the loop cannot thrash forever, and a human on the slow path for the genuinely novel. Specify that feedback law and "tight enough to converge" stops being a slogan and becomes a property you can hold the system to. Leave it unspecified and you are back to hoping.
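The feedback law described above is small enough to write down. What follows is a minimal sketch, assuming checks are named predicates over a state and the agent is a `propose` function. The toy agent is invented for the example: on every odd attempt it fixes one check and breaks a neighbour, to imitate the regression limit cycle.

```python
def run_loop(propose, checks, state, max_steps):
    """Monotone-progress loop: a round that turns a green check red is not counted."""
    def passing(s):
        return {name for name, check in checks.items() if check(s)}

    green = passing(state)
    for _ in range(max_steps):                 # step budget: the loop cannot thrash forever
        if green == set(checks):
            break
        candidate = propose(state, sorted(set(checks) - green))
        candidate_green = passing(candidate)
        if green <= candidate_green:           # accept only if nothing regressed
            state, green = candidate, candidate_green
    # Out of budget without all checks green: hand over to the human slow path.
    verdict = "done" if green == set(checks) else "escalate"
    return state, verdict

def toy_agent():
    """Hypothetical agent: fixes the first failing check, breaks a neighbour on odd attempts."""
    attempts = {"n": 0}
    def propose(state, failing):
        attempts["n"] += 1
        new = dict(state)
        new[failing[0]] = True
        if attempts["n"] % 2 == 1:
            for key in new:
                if key != failing[0] and new[key]:
                    new[key] = False
                    break
        return new
    return propose

CHECKS = {name: (lambda s, name=name: s[name]) for name in ("a", "b", "c")}
START = {"a": True, "b": False, "c": False}

print(run_loop(toy_agent(), CHECKS, START, max_steps=10))
print(run_loop(toy_agent(), CHECKS, START, max_steps=3))
```

With a budget of ten the loop settles, because the regressing rounds are refused and so never enter the ledger. With a budget of three it stops short and escalates, which is the budget doing its job.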
This is the resolution of the loose-versus-tight tension the loop discourse keeps tripping over. You do not choose between an agent that is free and an agent that is safe. You put the freedom in the actuator and the tightness at the boundary and in the feedback law. The region is large; its edges are hard; the loop is not allowed to go backwards. Inside it, the agent roams. At its boundary, the loop holds. The boundary is not a cage on the throw. It is the bumper in a child's bowling lane, and the point of a bumper is not to cramp the swing. You bowl as hard as you like; the bumpers are how the ball still reaches the pins.
If you want the whole shape in one object, look at a self-pacing loop with an explicit stop condition rather than a self-declared one. Claude Code's `/loop` is exactly that: it re-fires the same goal on each pass, lets the agent choose its own cadence, and ends not when the model feels finished but when a named condition is met. That is the right architecture, and it sharpens the only question that matters, which is what the named condition is made of. A panel of reviewers agreeing the work is good enough is a stop condition, but a qualitative one: it converges on taste, slowly, and it will not survive a thousand merges a day. For code the stop condition can be better than taste, and has to be. The loop should run until the diff provably satisfies a decidable acceptance criterion, and that criterion can sit anywhere on the ladder this essay has already climbed: a schema check or a calibrated-eval threshold at the cheap end, a product requirements document encoded in Lean at the strict end. What matters is that "done" is something evaluated against the change, not a human feeling finished and not a model declaring itself so. That is the whole point of Continuous Verification as a stop condition. It turns "the reviewers were satisfied" into "the specification is satisfied, and here is the proof", and only the second of those is something you can hand to an agent and walk away from.
One more honesty, because the verification apparatus is itself c(t) and an essay about pricing maintenance cannot exempt its own machinery. The gates, the model, the schemas, the generators all have to be maintained, and below some threshold they are not worth it. If you have one small system, a short expected lifetime, a low blast radius, and a maintainer who will never walk away, then small diffs, good taste, and a human reviewer dominate this whole apparatus, and building it would be its own act of over-engineering. Continuous Enforcement and Continuous Verification start to clear V > 0 when the opposite holds: many systems, long lifetimes, high blast radius, and a single maintainer who will be walked away from them, by holiday, by sleep, by the next priority. That is the regime they are for, and it is increasingly the regime agentic engineering creates.
Why this lets you spend less, not more
There is a tempting and expensive misreading of the agentic era, which is that quality is bought with scale: the biggest model, the longest context, the most tokens. Call it tokenmaxing. It is the open loop's answer to unreliability, throw more generation at the problem and hope the average comes out right, and it is the wrong answer, because an open loop does not get more correct as you feed it more. It gets more confident.
A closed loop inverts the economics. Once the comparator is real, a kernel or a schema or a calibrated eval that can actually say no, you no longer need the model to be right on the first try. You need it to be right eventually, under correction, and that is a far weaker and far cheaper requirement. A small, inexpensive model inside a tight loop, proposing, taking a hard verdict, and revising, can converge on a verified target that a frontier model asked once, with no comparator, only approximates with great confidence. You are buying correctness with verification instead of with parameters and tokens, and verification is the part that holds its value after you walk away. The frontier model is not wasted; it is spent where judgement is genuinely irreducible, on writing the spec and naming the invariants, and not on grinding out a diff a cheaper model could have produced and a kernel could have checked. The win belongs to the loop, not to the size of the model: a cheaper model is allowed to be wrong on the way to a target it cannot fake having reached. There is a floor, of course, a model too weak to ever satisfy the checks will burn its step budget and never converge, but above that floor you are paying for verification rather than for parameters, and that is the cheaper place to pay.
The strongest objections
"This is just CI and linting with a manifesto." Partly, and I have tried to make the lineage explicit rather than hide it: CI already is software's andon cord. The differences that matter are three. A linter checks syntax against a universal style; an enforcement gate checks semantics against a specific incident that actually happened, and refuses it. A linter is bypassed with a pragma and no trace; a blocking gate's override is a logged Study artefact. And ordinary CI verifies that the code runs; Continuous Verification verifies, on the decidable corner, that the specification is satisfied, which is the harder claim, the difference between "the tests are green" and "the thing the tests were about is provably true on inputs that could have made it false". The novelty is not a half software forgot. It is a recalibration for an author that emits at machine speed and has no stake in the outcome.
"Formal verification does not scale, and most of it is theatre." The second half I concede so completely that the counter-model rule and the public confession of my own vacuous-predicate audit are built into the argument rather than bolted on. The scaling answer is that you do not formalise everything and the discipline forbids pretending you have. You formalise the comparator's decidable corner, the predicates that decide "done", the boundaries where untyped data causes the expensive failures, the handful of invariants that metastasise when violated. The irreducibly fuzzy stays prose, gets a calibrated eval, and is labelled as not proved. The goal was never a verified world. It is a verified setpoint, a far smaller surface.
"Gates will over-fit and grind the agent to a crawl." They will if you let them, and a gate that fires on the innocent is a defect against its own lifetime-value, not a stronger gate. That is why blocking strength is matched to blast radius, why the warning tier exists as a probation period, and why a gate that cries wolf is demoted, not defended. Enforcement is subject to the discipline it enforces. A loop that oscillates is not more in control than one that drifts; it is failing in the other direction.
"Agents will get good enough that none of this is necessary." This is the strongest objection, and the weak version of it, "agents will be good in general", is easy to dismiss and not worth dismissing. The strong version is specific: the model-in-a-loop insight dissolves the actuator/comparator split, because unlike a cutting tool, which cannot read its own caliper, an agent can read the spec, run the kernel, and observe the live system. The essay relies on exactly that capability when it calls genchi genbutsu "the agent going to see". So the honest question is not whether external verification survives, it is what fraction of the comparator the actuator can absorb, and how fast that fraction is rising. My answer is bounded but real: the external comparator survives for the decidable-and-catastrophic subset, the places where the actuator has generative pressure to convince itself, and you, that it has succeeded. The argument there is not about capability at all. It is about independence: the thing acting and the thing judging should not be the same process, for the same reason an auditor should not report to the executive whose numbers they audit. As the actuator absorbs more of the comparator, that catastrophic subset shrinks, but it does not vanish, and over it the separation is worth keeping however capable the model becomes. Nowhere is this sharper than in recursive self-improvement, a loop whose actuator is more capable on every turn. A strengthening actuator does not retire the independent comparator; it is the strongest argument for one, because the single thing a self-improving loop cannot be trusted to certify is itself.
The factory was always the point
"Software factory" has been a term of mild derision for thirty years, shorthand for treating creative engineering as an assembly line. The agentic era is quietly making it literal, and the derision was always aimed at the wrong half. The problem was never the structure. It was a factory without an andon cord, without a comparator, without a rule against leaving the workaround in place, which is to say a factory that was only ever an open loop with good marketing.
We are about to run a great many coding factories, each one a loop with a language model where the cutting tool used to be. The loop, as everyone has noticed, is the easy part. It is genuinely a while-statement, and an agent genuinely is a model in a loop. The engineering is in the parts the rediscoverers leave unspecified: the setpoint you can name, the comparator that can actually say no and prove it, the cord that stops the line before the deviation becomes the norm, and the human investigation that turns each novel failure into the next gate.
Toyota built that out of people and a rope, and was careful, even as it automated, to keep the people. We get to build it out of kernels and hooks. The least we can do is admit they finished the design first, and copy the parts they were wise enough not to automate away.
Dedicated to my friend and brother Kejia Zhu, for his tireless technical curiosity and his dedication to personal growth and pathfinding.
About the author: Eduardo Aguilar Pelaez is CTO and co-founder at Legal Engine Ltd. He writes on formal methods, AI agents, and the discipline of building systems that survive being walked away from.