[ARTICLE · art-47078] src=the-ai-corner.com ↗ pub= topic=ai-agents verified=true sentiment=↑ positive

Set a metric. Walk away. Let the agent optimize overnight.

Andrej Karpathy's 'autoresearch' technique lets AI agents optimize any metric overnight by running hundreds of experiments autonomously. One engineer applied the method to file compression, beating off-the-shelf tools for $40. The approach works by setting a metric, bounding it with constraints, and letting an agent iterate while the human sleeps.

3 min read · published Jul 3, 2026

Karpathy runs 100 experiments while he sleeps. One engineer beat off-the-shelf compression tools for $40. Here is the exact playbook to run this on any metric you care about.

There is a new way to work, and it fits in one sentence:

Pick a number, bound it with constraints, and let an agent push it while you sleep.

Karpathy calls it autoresearch. His repo gives an agent one file, one metric, and a fixed 5-minute budget per experiment. The agent edits, trains, keeps what improves, reverts what fails, and loops. Roughly 12 experiments an hour, about 100 overnight. Shopify’s CEO woke up to a model that beat his hand-tuned baseline. Karpathy’s own agent caught a bug he had missed for months.
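The edit, evaluate, keep-or-revert cycle described above can be sketched in a few lines. This is a minimal illustration, not Karpathy's repo: `run_loop`, `propose`, and `evaluate` are hypothetical names, the "agent" is replaced by a random perturbation, the metric is assumed to be lower-is-better, and the 8-hour night is an assumption used only to reproduce the "about 100" figure.

```python
import random

BUDGET_MINUTES = 5                                # fixed budget per experiment (from the article)
EXPERIMENTS_PER_HOUR = 60 // BUDGET_MINUTES       # 12
ASSUMED_NIGHT_HOURS = 8                           # assumption, not stated in the article
OVERNIGHT_RUNS = EXPERIMENTS_PER_HOUR * ASSUMED_NIGHT_HOURS  # 96, i.e. "about 100"


def run_loop(state, propose, evaluate, iterations):
    """Greedy keep-or-revert loop. Lower scores are treated as better."""
    best_state = state
    best_score = evaluate(state)
    history = []
    for _ in range(iterations):
        candidate = propose(best_state)   # stand-in for "the agent edits the file"
        score = evaluate(candidate)       # stand-in for "train within the time budget"
        if score < best_score:
            best_state, best_score = candidate, score   # keep what improves
            history.append(("keep", score))
        else:
            history.append(("revert", score))           # discard what fails
    return best_state, best_score, history


# Toy problem: minimise (x - 3)^2 by random nudges. Purely illustrative.
rng = random.Random(0)


def propose(x):
    return x + rng.uniform(-1.0, 1.0)


def evaluate(x):
    return (x - 3.0) ** 2


start = 0.0
best_x, best_score, history = run_loop(start, propose, evaluate, OVERNIGHT_RUNS)
```

In a real setup, `propose` would be an agent call that edits one file and `evaluate` would run the training job; the loop structure stays the same.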

His line is the one to keep:

“Any metric you care about that is reasonably efficient to evaluate can be autoresearched by an agent swarm. It’s worth thinking about whether your problem falls into this bucket too.”

And this pattern extends past machine learning. One engineer pointed it at file compression with Claude Code: 10 unsupervised iterations at about $4 each, and the home-built algorithm beat common tools on audio and video. Zero ML involved. Just a metric, two constraints, and a loop.
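A metric with two constraints might look like the sketch below. The article does not say which two constraints the engineer used, so the ones here (a lossless round trip and a wall-clock limit) are placeholders chosen for illustration, as are the name `score_codec` and the use of Python's `zlib` as the off-the-shelf baseline. The last lines simply restate the article's cost figures.

```python
import time
import zlib
from typing import Callable, Optional


def score_codec(
    compress: Callable[[bytes], bytes],
    decompress: Callable[[bytes], bytes],
    data: bytes,
    max_seconds: float,
) -> Optional[float]:
    """Return compressed/original size ratio (lower is better), or None if a constraint fails."""
    if not data:
        raise ValueError("need non-empty sample data")
    started = time.perf_counter()
    packed = compress(data)
    restored = decompress(packed)
    elapsed = time.perf_counter() - started
    if restored != data:        # assumed constraint 1: output must round-trip exactly
        return None
    if elapsed > max_seconds:   # assumed constraint 2: stay inside a time budget
        return None
    return len(packed) / len(data)


sample = b"autoresearch " * 1000

# Baseline: an off-the-shelf codec.
baseline_ratio = score_codec(lambda d: zlib.compress(d, 9), zlib.decompress, sample, 5.0)

# A "codec" that games the metric by throwing the data away is rejected.
cheat_ratio = score_codec(lambda d: b"", lambda p: b"", sample, 5.0)

# Cost arithmetic from the article: 10 iterations at about $4 each.
ITERATIONS = 10
COST_PER_ITERATION_USD = 4
total_cost_usd = ITERATIONS * COST_PER_ITERATION_USD  # 40
```

Without the round-trip check, the cheating codec would score a perfect ratio of 0, which is why the constraints matter as much as the metric.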

That is the whole trick, and it remains a rare skill. The gap between reading about loops and shipping one is a handful of decisions most people get wrong on the first try: which metric, which constraints, which loop mechanism, and how to stop the agent from gaming you.

Behind the paywall, the full system:

▫️ The 5 design decisions that make or break a loop, and the defaults that work

▫️ The metric-picker framework, including the silent trap that bent the compression experiment

▫️ The 3 copy-paste prompts: the harness builder, the iteration prompt, and the metric auditor

▫️ The tooling menu, autoresearch vs Ralph loops vs /goal, /loop, and /batch, and when each wins

▫️ The constraint patterns that stop reward hacking before it starts

▫️ The cost math, what a loop costs per iteration and when the ROI turns positive

▫️ The business translation, running this on conversion, latency, and content metrics with slow feedback

▫️ The skip list, the problems where loops burn money and a human wins

One subscription unlocks every playbook

This is one system in a growing library. Premium opens all of them:

▫️ [Loop engineering for coding agents](https://www.the-ai-corner.com/p/loop-engineering-coding-agents-2026?r=1krivi)

▫️ [The Claude managed agents guide](https://www.the-ai-corner.com/p/claude-managed-agents-guide-2026?r=1krivi)

▫️ [The AI agent reliability playbook](https://www.the-ai-corner.com/p/ai-agent-reliability-playbook?r=1krivi)

Plus a fresh build every week. One overnight loop that lands a win pays the subscription back on the first run.

The 5 design decisions, the metric framework, the 3 prompts, the tooling menu, the anti-gaming constraints, and the cost math, in one system you can launch tonight.

Try premium free for 7 days. Or get 50% off this week only.

