Feedback Latency Is the Agent's IQ

The same agent, with the same prompts, did markedly different work on two of my codebases. One has a test suite that runs in eight seconds. The other takes twelve minutes. The eight-second project gets a careful, iterative collaborator. The twelve-minute project gets a confident guesser.

I noticed it first as a vibe. The agent in the slow codebase would write five files at once, then announce the task complete without having run anything end to end. The agent in the fast codebase would write one function, run the tests, react to the failure, fix it, run them again. Same model. Same configuration. The only difference was how expensive it was to learn whether the previous step was right.

That is the whole post in one sentence. An agent's effective intelligence is bounded by how fast it can verify its hypotheses. Cut the verification cost and you raise the agent's apparent IQ. Raise it and you lower the agent's apparent IQ. The model in the middle is unchanged.

A human engineer can hold a hypothesis in their head. "I think this works. I will check it later." The cost of holding the hypothesis is roughly free; the human has institutional memory, intuition, a sense of what the code does that does not require running the code to confirm. They can defer verification without losing fidelity.

An agent cannot. It has no intuition about your codebase. The only ground truth it has access to is what the tests say, what the type checker says, what the build says. When those signals are cheap, the agent uses them constantly. When they are expensive, the agent stops using them and starts speculating.

Speculation by an agent looks plausible. It produces code that compiles, follows the patterns it has seen in your repository, names things sensibly. The problem is that plausible is not the same as correct. The agent that speculates is shipping a guess; the agent that iterates is shipping a tested answer. From the diff alone, they can be hard to tell apart.

This is the insight that took me too long to internalize. Slow feedback does not produce a slower agent. It produces a less honest one.

Some rough buckets, from the agent's behavior on tasks I have watched closely.

Sub-second feedback (lint, type check, a small unit test on save) gives you an agent that operates like a careful TDD practitioner. It edits, it checks, it edits, it checks. The unit of work is a single change. Mistakes get caught before they leave the function they were made in.

A few seconds to a minute (a fast unit suite) gives you an agent that batches changes into small commits and runs the suite between them. Mistakes get caught within the file or the module. The agent corrects course frequently and visibly.

Five to ten minutes (a typical integration suite) gives you an agent that runs the suite at the start, makes a chunk of changes, runs it once more at the end, and crosses its fingers in between. Mistakes get caught only at boundaries. Subtle regressions inside the chunk sometimes slip past because the agent did not get the granular feedback that would have flagged them.

Twenty minutes or more (a slow CI loop, or a suite the agent only runs locally because nothing else is fast enough) gives you the speculator. The agent runs the suite once, maybe twice in the whole session, and otherwise reasons from the code without verifying. Mistakes get caught in review, in QA, or in production.

The breakpoints are not exact. The shape is real. Every order of magnitude of latency you remove from the loop is an order of magnitude of fidelity you give back to the agent.

The case for fast tests used to go like this. Developers will run them more if they are fast. Faster feedback catches bugs earlier. Slow tests are skipped, and skipped tests are dead code. All of that is still true.

There is a new term in the equation. Your agent is now one of the consumers of the test suite. A test suite optimized for the agent looks the same as a test suite optimized for the developer, only more so. The bar is no longer "fast enough that developers will run it". The bar is "fast enough that the agent will run it between every meaningful change".

That bar is much stricter. A developer might tolerate a two-minute suite. An agent will gladly run a two-second suite a hundred times in an hour. The investments that get you there (ports-and-adapters seams, in-memory fakes for everything that does not need a database, a hard line between unit and integration tiers) are the same investments I argued for in a piece about cutting CI time. The argument is the same; the consumer is different; the payoff compounds.

A short list of the things that actually move the needle.

Dependency inversion in production code. Every external boundary (database, mailer, HTTP client, queue) sits behind an interface owned by the application. In tests, the interface gets an in-memory implementation. The test process does not boot Docker; it does not migrate a schema; it does not open a socket. The cost of a test drops to milliseconds.
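A minimal sketch of that seam, assuming a mailer as the external boundary. The names `Mailer`, `InMemoryMailer`, and `register_user` are hypothetical and chosen only for illustration; a production implementation of the same interface would wrap SMTP or a vendor API.

```python
from dataclasses import dataclass, field
from typing import Protocol


class Mailer(Protocol):
    """Port owned by the application (hypothetical name)."""

    def send(self, to: str, subject: str, body: str) -> None: ...


@dataclass
class InMemoryMailer:
    """Test implementation: records messages instead of opening a socket."""

    sent: list[tuple[str, str, str]] = field(default_factory=list)

    def send(self, to: str, subject: str, body: str) -> None:
        self.sent.append((to, subject, body))


def register_user(email: str, mailer: Mailer) -> None:
    """Application logic depends only on the Mailer port."""
    if "@" not in email:
        raise ValueError(f"invalid email: {email!r}")
    mailer.send(email, "Welcome", "Thanks for signing up.")


def test_register_user_sends_welcome_mail() -> None:
    # No Docker, no schema, no network: the whole test is in-process.
    mailer = InMemoryMailer()
    register_user("ada@example.com", mailer)
    assert mailer.sent == [
        ("ada@example.com", "Welcome", "Thanks for signing up.")
    ]
```

Because `register_user` only sees the interface, the test never touches real infrastructure, which is what keeps it cheap enough to run after every edit.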

A test pyramid that earns its confidence. Unit tests, the cheap and numerous layer, run on every change. Integration tests, the few that genuinely need infrastructure, run separately and less often. Mixing them is what produces the suite that takes ten minutes when it could take ten seconds.
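One way to draw that line, sketched with the standard library's `unittest`. The `RUN_INTEGRATION` environment variable and the test class names are assumptions made for this example, not a convention from any particular project.

```python
import os
import unittest

# Assumed convention: integration tests run only when explicitly requested.
RUN_INTEGRATION = os.environ.get("RUN_INTEGRATION") == "1"

integration = unittest.skipUnless(
    RUN_INTEGRATION, "integration tier: set RUN_INTEGRATION=1 to run"
)


class PriceMathTest(unittest.TestCase):
    """Unit tier: pure logic, runs on every change."""

    def test_total(self) -> None:
        self.assertEqual(sum([2, 3]), 5)


@integration
class OrdersDatabaseTest(unittest.TestCase):
    """Integration tier: a real version would need a live database."""

    def test_roundtrip(self) -> None:
        # Placeholder body; a real test would connect to infrastructure here.
        self.assertTrue(True)
```

Running `python -m unittest` by default exercises only the unit tier; setting `RUN_INTEGRATION=1` opts into the slower one, so the agent's default loop stays short.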

Pre-commit hooks that catch the silly stuff. Lint and type errors should fail in milliseconds, not after a full test run. The agent should never spend a CI cycle learning that it forgot a semicolon. That is what the local hook is for.
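A rough sketch of such a hook as a Python script. The tool choices in `CHECKS` (`ruff` and `mypy`) are assumptions; substitute whatever linter and type checker the project already uses. The script stops at the first failing check so the cheapest signal arrives first.

```python
import subprocess
import sys
import time

# Assumed tools, cheapest first; replace with your project's own commands.
CHECKS = [
    ("lint", ["ruff", "check", "."]),
    ("types", ["mypy", "."]),
]


def run_checks(checks: list[tuple[str, list[str]]]) -> int:
    """Run checks in order; return the first non-zero exit code, else 0."""
    for name, argv in checks:
        started = time.perf_counter()
        code = subprocess.run(argv).returncode
        elapsed = time.perf_counter() - started
        print(f"{name}: exit {code} in {elapsed:.2f}s")
        if code != 0:
            return code  # fail fast: later checks are skipped
    return 0


# Installed as an executable .git/hooks/pre-commit, the script would end with:
#     sys.exit(run_checks(CHECKS))
```

Git aborts the commit when the hook exits non-zero, so the agent learns about the lint or type error before any test run starts.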

A clean separation of test scopes. When the agent can run only the tests for the module it is touching, every cycle of the loop is faster. The agent will choose the narrowest scope it can defend, and ship more confident work as a result.

No shared state between tests. Parallelism is the cheapest speedup you have not taken yet. A suite that can run across all your cores is one that the agent can iterate against without burning your patience.
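As an illustration of what "no shared state" can look like in practice, here is a small `unittest` sketch in which every test builds its own fixture. `InMemoryCart` is a hypothetical domain object invented for the example.

```python
import unittest


class InMemoryCart:
    """Hypothetical domain object used only for this illustration."""

    def __init__(self) -> None:
        self.items: dict[str, int] = {}

    def add(self, sku: str, qty: int = 1) -> None:
        self.items[sku] = self.items.get(sku, 0) + qty


class CartTest(unittest.TestCase):
    def setUp(self) -> None:
        # Fresh cart per test: nothing leaks between tests, so ordering
        # and parallel execution cannot change the outcome.
        self.cart = InMemoryCart()

    def test_add_accumulates_quantity(self) -> None:
        self.cart.add("sku-1")
        self.cart.add("sku-1", 2)
        self.assertEqual(self.cart.items, {"sku-1": 3})

    def test_new_cart_is_empty(self) -> None:
        self.assertEqual(self.cart.items, {})
```

Tests written this way pass in any order, which is the precondition for spreading them across cores; a module-level cart shared by both tests would make the second one depend on whether the first had already run.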

Before you tune your prompts, time your test suite. Before you write more rules, count the seconds it takes to learn whether the last change worked. The fastest path to a better agent in most codebases is not a bigger model or a richer harness. It is a feedback loop short enough that the agent can afford to use it.
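Timing the loop takes only a few lines. This sketch measures the wall-clock duration of any command over several runs; the helper name `time_command` and the pytest invocation in the comment are assumptions, so substitute the command your project actually uses.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv: list[str], runs: int = 3) -> list[float]:
    """Return the wall-clock duration in seconds of each run of argv."""
    durations: list[float] = []
    for _ in range(runs):
        started = time.perf_counter()
        subprocess.run(
            argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        durations.append(time.perf_counter() - started)
    return durations


# Demonstration with a trivial command. For a real measurement, pass your
# test command instead, e.g. [sys.executable, "-m", "pytest", "-q"].
samples = time_command([sys.executable, "-c", "pass"])
print(f"median: {statistics.median(samples):.2f}s over {len(samples)} runs")
```

The median is less sensitive than a single run to a cold cache on the first run, and it is the number to compare against the rough buckets described earlier.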

The model is doing the best it can with what you give it. What you give it, mostly, is how fast you let it learn.

Source: original article on dev.to