AGENTS.md is becoming the new code review contract

wpnews.pro

GitHub added a small Copilot code review feature this week that feels bigger than the changelog entry.

Copilot code review can now read repository-level AGENTS.md instructions.

That sounds like a nice quality-of-life improvement. Put your preferences in a file. Tell the agent how the project works. Get fewer weird review comments. Fine.

But I think the more interesting version is this: code review is starting to depend on machine-readable engineering judgment.

Not only style rules. Not only lint. Not only "please use pnpm."

Actual team taste.

The little rules that senior engineers carry around in their heads. The migration scars. The local architecture boundaries. The places where the codebase looks flexible but really is not. The patterns that are tolerated in one folder and forbidden in another. The test strategy that makes sense only if you know the history of the service.

For years, that knowledge lived in code reviews as repeated comments from tired humans. Now we are being asked to write it down for agents.

Good.

Also uncomfortable.

The easiest part of code review to automate is the part we should have automated already.

Formatting. Dead imports. Missing null checks. Basic security mistakes. Naming that violates a clear convention. Tests that obviously do not run. Dependencies that should not be added.

Those are useful checks, but they are not the reason code review matters.

The hard part of review is judgment.

Does this change belong in this layer? Is this abstraction premature? Is this behavior compatible with the migration we are halfway through? Is this the right place to pay down debt, or is it a distraction from the actual risk? Does this test prove the thing users care about, or only the implementation we happen to have today?

Humans answer those questions with context.

Some of that context is in the repository. Some is in docs. Some is in tickets. Some is in the memory of the person who has reviewed every painful refactor since 2021.

Agents can read a lot, but they are not automatically part of that memory.

AGENTS.md is one way to give them a map.

There is a tempting bad version of this.

A team creates an AGENTS.md file that says only generic things, advice that would be just as true of any other repository.

This is better than nothing in the same way a motivational poster is better than a blank wall.

It does not create a review contract.

A useful agent instruction file should be more local and more opinionated. It should say the things that are true here, in this repository, for this team, because of the system you actually maintain.

That is the useful stuff.

It is not universal. It is not glamorous. It is the team telling the agent where the rails are.

And once Copilot code review reads those instructions, they stop being documentation that maybe someone remembers. They become part of the review surface.

The awkward question is who gets to write the file.

If AGENTS.md influences automated review comments, then it is not just a developer convenience. It is part of the engineering control plane.

That means it needs ownership.

Not heavy bureaucracy. Please no.

But the file should not become a dumping ground for every frustrated reviewer to encode their personal preference. It should not become a prompt-shaped style guide with 300 rules nobody agrees with. It should not be rewritten casually by the same pull request it is supposed to constrain.
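That last point can be checked mechanically. Here is a minimal sketch, assuming the list of changed paths comes from something like `git diff --name-only`; the function name and the policy itself are hypothetical illustrations, not anything GitHub or Copilot provides.

```python
from typing import Iterable

# Assumption: the instruction file is named AGENTS.md, at the repo root or in a subfolder.
INSTRUCTION_FILE = "AGENTS.md"


def is_instruction_file(path: str) -> bool:
    """True for AGENTS.md at the root or nested in any directory."""
    return path == INSTRUCTION_FILE or path.endswith("/" + INSTRUCTION_FILE)


def needs_separate_review(changed_paths: Iterable[str]) -> bool:
    """Flag change sets that edit the instruction file together with other files.

    Hypothetical policy: a pull request should not rewrite the rules
    it is about to be reviewed against.
    """
    paths = list(changed_paths)
    touches_instructions = any(is_instruction_file(p) for p in paths)
    touches_other = any(not is_instruction_file(p) for p in paths)
    return touches_instructions and touches_other
```

A CI job could call `needs_separate_review` and ask the author to split the instruction change into its own pull request. Whether that is worth enforcing is a team decision.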

The best version probably looks like any other important repository policy: it has owners, and changes to it get reviewed with the same care as other policy files.

This is where the "agents replace reviewers" story gets too shallow.

Agents do not remove human judgment. They make the written parts of human judgment more valuable.

If the team cannot explain what it wants, the agent will mostly learn the easy surface: syntax, file names, nearby patterns, and generic advice from the internet. That may be enough for small changes.

It is not enough for the weird parts of real systems.

We have been here before, just with smaller tools.

Linters turned some taste into executable rules. Formatters ended whole categories of review comments. Type systems moved mistakes earlier. CI made "works on my machine" less persuasive. Policy-as-code moved some operational rules out of meetings and into checks.

Each step changed code review.

The reviewer stopped spending time on semicolons and started spending more time on behavior. Or at least that was the promise.

Agent instructions are a similar move, but less deterministic.

A linter either reports a rule violation or it does not. An agent reads an instruction, mixes it with code context, model behavior, and whatever else is in the prompt, then produces a comment that may or may not be useful.
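To make the deterministic half of that contrast concrete, here is a small sketch of a lint-style check built on Python's standard `ast` module. The rule (a set of forbidden top-level packages) and the function name are assumptions chosen for illustration. The same source and the same rule always produce the same findings, which is the property an agent reading prose instructions does not have.

```python
import ast


def find_forbidden_imports(source: str, forbidden: set[str]) -> list[tuple[int, str]]:
    """Return (line number, module name) for every absolute import of a forbidden package."""
    hits: list[tuple[int, str]] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in forbidden:
                    hits.append((node.lineno, alias.name))
        elif isinstance(node, ast.ImportFrom):
            # level == 0 means an absolute import; relative imports are skipped.
            if node.level == 0 and node.module and node.module.split(".")[0] in forbidden:
                hits.append((node.lineno, node.module))
    # Sort so the output order does not depend on traversal order.
    return sorted(hits)
```

There is no judgment in this function: it either finds the import or it does not. That is exactly why it cannot answer whether the import is acceptable in this folder, for this migration, this month.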

So we should not pretend AGENTS.md is the same as a test suite.

It is softer than that.

But soft does not mean useless.

Engineering organizations already run on soft contracts: architecture principles, design review norms, escalation rules, ownership boundaries, deploy expectations, and the informal "we do not do that here" knowledge every healthy team has.

The difference is that agents need those soft contracts in writing.

There is a failure mode I expect to see a lot.

The agent starts leaving confident comments based on stale or vague instructions.

It tells people not to use a pattern that is now approved. It repeats a rule that only applied during a migration that ended months ago. It blocks a reasonable local exception because the file says "never." It comments on every pull request with the same generic architecture sermon.

That will make developers hate the tool quickly.

The fix is not to abandon repository instructions. The fix is to treat them as living code.

If an instruction produces bad review comments, change the instruction. If the instruction is correct but the agent applies it poorly, make it narrower. If a rule has exceptions, name the exceptions. If the rule is actually preference dressed up as architecture, remove it. This is boring maintenance work.

That is why it matters.

The teams that get value from AI review will not be the teams with the longest instruction files. They will be the teams with the clearest ones.
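Part of that maintenance can be scripted. The sketch below assumes a convention that is entirely hypothetical: each rule is a bullet line that may end with a `(review-by: YYYY-MM-DD)` note. Neither AGENTS.md nor Copilot defines such a note; it is only one way a team might date its own rules. The script flags rules whose date has passed and rules that use absolute wording.

```python
import re
from datetime import date

# Hypothetical annotation a team might append to a rule line.
REVIEW_BY = re.compile(r"\(review-by:\s*(\d{4}-\d{2}-\d{2})\)")
# Absolute wording tends to block reasonable local exceptions.
ABSOLUTES = re.compile(r"\b(never|always)\b", re.IGNORECASE)


def audit_rules(text: str, today: date) -> list[str]:
    """Return human-readable findings for bullet-style rules in an instruction file."""
    findings: list[str] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.lstrip().startswith("- "):
            continue  # only bullet lines are treated as rules
        match = REVIEW_BY.search(line)
        if match and date.fromisoformat(match.group(1)) < today:
            findings.append(f"line {lineno}: review-by date has passed")
        if ABSOLUTES.search(line):
            findings.append(f"line {lineno}: absolute wording, consider naming exceptions")
    return findings
```

A script like this cannot tell whether a rule is still right. It can only remind someone to look, which is most of what stale rules need.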

One thing I would like to see more of in AI-assisted review is provenance.

If Copilot leaves a comment because of AGENTS.md, say that.

Point to the instruction. Let the author and reviewer see which local rule was involved. Make it easy to tell the difference between a generic model concern and a repository-specific contract.

That matters because humans need to debug the system.

When a human reviewer gives bad feedback, you can talk to the human. When an agent gives bad feedback, you need to know which part of the system produced it: the model, the code context, the prompt, the repo instructions, a stale doc, or a missing exception.

Without that trail, teams will either trust the comments too much or ignore them entirely.

Neither is good.

Good AI review should feel less like a mysterious second reviewer and more like a visible extension of the team's own standards.
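As an illustration of what that provenance could look like, here is a minimal sketch of a review comment that carries its source. The `ReviewComment` type and its fields are hypothetical and do not describe Copilot's actual comment format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReviewComment:
    """A review comment plus the repository instruction that triggered it, if any."""
    path: str
    line: int
    body: str
    instruction_file: Optional[str] = None  # e.g. "AGENTS.md" when a local rule applied
    instruction_line: Optional[int] = None  # line of the rule inside that file, if known

    @property
    def is_repo_specific(self) -> bool:
        return self.instruction_file is not None


def render(comment: ReviewComment) -> str:
    """Format a comment so the reader can see where the feedback came from."""
    if comment.instruction_file is not None:
        source = comment.instruction_file
        if comment.instruction_line is not None:
            source += f":{comment.instruction_line}"
    else:
        source = "general model feedback"
    return f"{comment.path}:{comment.line} [{source}] {comment.body}"
```

With a field like this, a bad comment can be traced to a specific rule and the rule can be fixed, instead of the whole tool being distrusted.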

AGENTS.md support in Copilot code review is a small feature with a serious implication.

Repositories are becoming places where teams encode not only code, tests, and configuration, but also instructions for non-human collaborators.

That is the right direction.

The codebase should explain itself to the tools that work inside it. The review agent should know more than generic best practices. It should know the local contracts that make this system maintainable.

But this only works if teams take the file seriously.

Write down the judgment you actually want repeated. Keep it short. Keep it local. Review changes to it with the same care you give other policy files. Remove stale rules. Prefer concrete constraints over vague taste. Watch whether the comments get better.

The model is not going to magically learn your engineering culture from folder names.

If you want agents to review like members of the team, you have to give them the team's standards in a form they can use. That is what AGENTS.md is becoming.

Not a prompt.

A review contract.


Source: original article on dev.to
