Set a metric. Walk away. Let the agent optimize overnight.


Source: the-ai-corner.com · Published: 2026-07-03T16:54Z


Andrej Karpathy's 'autoresearch' technique lets AI agents optimize any metric overnight by running hundreds of experiments autonomously. One engineer applied the method to file compression, beating off-the-shelf tools for $40. The approach works by setting a metric, bounding it with constraints, and letting an agent iterate while the human sleeps.


Karpathy runs 100 experiments while he sleeps. One engineer beat off-the-shelf compression tools for $40. Here is the exact playbook to run this on any metric you care about.

There is a new way to work, and it fits in one sentence:

Pick a number, bound it with constraints, and let an agent push it while you sleep.

Karpathy calls it autoresearch. His repo gives an agent one file, one metric, and a fixed 5-minute budget per experiment. The agent edits, trains, keeps what improves, reverts what fails, and loops. Roughly 12 experiments an hour, about 100 overnight. Shopify’s CEO woke up to a model that beat his hand-tuned baseline. Karpathy’s own agent caught a bug he had missed for months.
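The keep-or-revert loop described above can be sketched in a few lines. This is a minimal illustration, not Karpathy's code: `propose` and `evaluate` are hypothetical stand-ins for the agent's edit and the training run, the metric is a toy function where higher is better, the budget is counted in experiments rather than wall-clock minutes, and the 8-hour night is an assumption.

```python
import random
from typing import Callable, List, Tuple


def autoresearch_loop(
    baseline: List[float],
    propose: Callable[[List[float], random.Random], List[float]],
    evaluate: Callable[[List[float]], float],
    n_experiments: int,
    seed: int = 0,
) -> Tuple[List[float], float, int]:
    """Greedy keep-or-revert loop: keep a candidate only if the metric improves."""
    rng = random.Random(seed)
    best = list(baseline)
    best_score = evaluate(best)
    kept = 0
    for _ in range(n_experiments):
        candidate = propose(best, rng)  # the "edit" step
        score = evaluate(candidate)     # the "train and measure" step
        if score > best_score:          # keep what improves
            best, best_score = candidate, score
            kept += 1
        # otherwise revert: `best` is left untouched
    return best, best_score, kept


# Toy stand-ins (hypothetical, for illustration only).
def toy_propose(params: List[float], rng: random.Random) -> List[float]:
    i = rng.randrange(len(params))
    out = list(params)
    out[i] += rng.uniform(-0.5, 0.5)
    return out


def toy_evaluate(params: List[float]) -> float:
    # Higher is better; the maximum of 0.0 is reached at [1.0, -2.0].
    target = [1.0, -2.0]
    return -sum((p - t) ** 2 for p, t in zip(params, target))


MINUTES_PER_EXPERIMENT = 5
experiments_per_hour = 60 // MINUTES_PER_EXPERIMENT  # 12 per hour
overnight = experiments_per_hour * 8                 # 96, assuming an 8-hour night

start_score = toy_evaluate([0.0, 0.0])
best, best_score, kept = autoresearch_loop(
    [0.0, 0.0], toy_propose, toy_evaluate, overnight
)
print(f"{overnight} experiments, {kept} kept, score {start_score:.2f} -> {best_score:.2f}")
```

Because a candidate replaces the current best only when the score strictly improves, the final score can never fall below the baseline. That property is what makes the loop safe to leave unattended.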

His line is the one to keep:

“Any metric you care about that is reasonably efficient to evaluate can be autoresearched by an agent swarm. It’s worth thinking about whether your problem falls into this bucket too.”

And this pattern extends past machine learning. One engineer pointed it at file compression with Claude Code: 10 unsupervised iterations at about $4 each, and the home-built algorithm beat common tools on audio and video. Zero ML involved. Just a metric, two constraints, and a loop.
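A metric bounded by constraints might look like the sketch below. The article does not say which two constraints the engineer used, so both here are assumptions chosen for illustration: a lossless round trip and a time budget. `score_codec` is a hypothetical helper, and `zlib` stands in for an off-the-shelf baseline.

```python
import time
import zlib
from typing import Callable, Optional


def score_codec(
    compress: Callable[[bytes], bytes],
    decompress: Callable[[bytes], bytes],
    sample: bytes,
    max_seconds: float = 1.0,
) -> Optional[float]:
    """Return the compression ratio, or None if a constraint is violated."""
    start = time.perf_counter()
    packed = compress(sample)
    restored = decompress(packed)
    elapsed = time.perf_counter() - start
    if restored != sample:     # assumed constraint 1: lossless round trip
        return None
    if elapsed > max_seconds:  # assumed constraint 2: time budget
        return None
    return len(sample) / len(packed)  # metric: higher is better


sample = b"set a metric, walk away. " * 400

honest = score_codec(zlib.compress, zlib.decompress, sample)
# A "codec" that games the ratio by discarding data fails the round-trip check.
cheat = score_codec(lambda data: b"x", lambda packed: b"", sample)

iterations, cost_per_iteration = 10, 4  # figures from the article, in USD
total_cost = iterations * cost_per_iteration
print(f"zlib ratio: {honest:.1f}, cheating codec: {cheat}, loop cost: ${total_cost}")
```

Returning `None` instead of a low score keeps an invalid candidate out of the comparison entirely, so the loop cannot trade a broken constraint for a better number.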

That is the whole trick, and it remains a rare skill. The gap between reading about loops and shipping one is a handful of decisions most people get wrong on the first try: which metric, which constraints, which loop mechanism, and how to stop the agent from gaming you.

Behind the paywall, the full system:

▫️ The 5 design decisions that make or break a loop, and the defaults that work

▫️ The metric-picker framework, including the silent trap that bent the compression experiment

▫️ The 3 copy-paste prompts: the harness builder, the iteration prompt, and the metric auditor

▫️ The tooling menu, autoresearch vs Ralph loops vs /goal, /loop, and /batch, and when each wins

▫️ The constraint patterns that stop reward hacking before it starts

▫️ The cost math, what a loop costs per iteration and when the ROI turns positive

▫️ The business translation, running this on conversion, latency, and content metrics with slow feedback

▫️ The skip list, the problems where loops burn money and a human wins

One subscription unlocks every playbook

This is one system in a growing library. Premium opens all of them:

▫️ [Loop engineering for coding agents](https://www.the-ai-corner.com/p/loop-engineering-coding-agents-2026?r=1krivi)

▫️ [The Claude managed agents guide](https://www.the-ai-corner.com/p/claude-managed-agents-guide-2026?r=1krivi)

▫️ [The AI agent reliability playbook](https://www.the-ai-corner.com/p/ai-agent-reliability-playbook?r=1krivi)

Plus a fresh build every week. One overnight loop that lands a win pays the subscription back on the first run.

The 5 design decisions, the metric framework, the 3 prompts, the tooling menu, the anti-gaming constraints, and the cost math, in one system you can launch tonight.

Try premium free for 7 days. Or get 50% off this week only.

Get The Autoresearch Playbook below 👇




