University of Cambridge and Nvidia unveil co-evolution framework for AI agents and evaluators


Source: cryptobriefing.com · Published: 2026-06-29T05:09Z

Researchers from the University of Cambridge and Nvidia introduced the Red Queen Gödel Machine (RQGM), a framework enabling AI agents and their evaluators to co-evolve, addressing stagnation in recursive self-improvement. Preliminary results show improvements in scientific paper acceptance rates, mathematical proof accuracy, and coding efficiency, though the paper has not yet been peer-reviewed.

The Red Queen Gödel Machine lets AI systems and their judges improve together, tackling a longstanding stagnation problem in recursive self-improvement

Here’s a fundamental problem with building AI that can improve itself: the thing grading the homework never gets any smarter. A static evaluator eventually becomes the bottleneck. A new research framework from the University of Cambridge and Nvidia aims to fix that by letting both the AI agent and its evaluator evolve in tandem.

The preprint paper, titled “The Red Queen Gödel Machine: Co-Evolving Agents and Their Evaluators,” was submitted on June 24 by a team of 13 authors spanning Cambridge, Nvidia, Flower Labs, MBZUAI, and Inria. The core idea is deceptively simple: if AI agents keep getting better but their evaluators stay frozen in place, progress stalls. So make them evolve together.

How RQGM actually works

The framework, abbreviated RQGM, introduces what the researchers call “epoch-based controlled utility evolution.” The system runs in discrete rounds where both the AI doing the work and the AI judging the work get upgraded simultaneously.
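The round structure described above can be sketched as a toy loop. This is a minimal illustration, not the paper's algorithm: agents and evaluators are reduced to single numbers, the names `score`, `evaluator_error`, and `run_epoch` are hypothetical, and the choice to freeze the evaluator while agents are selected (then freeze the agents while the evaluator is selected) is an assumption the article does not spell out.

```python
import random

# Hypothetical simplification: an agent is just its true quality (a float),
# and an evaluator is just the highest quality it is able to recognise.

def score(evaluator, agent):
    """What the evaluator reports: quality above its ceiling is invisible."""
    return min(agent, evaluator)

def evaluator_error(evaluator, agents):
    """Assumed ground-truth anchor: mean gap between reported and true quality."""
    return sum(abs(score(evaluator, a) - a) for a in agents) / len(agents)

def run_epoch(agents, evaluator, rng, step=1.0):
    """One discrete round: select agents, then select an evaluator."""
    # Phase 1: evaluator frozen. Mutate every agent and keep the top scorers.
    candidates = agents + [a + rng.uniform(-step, step) for a in agents]
    candidates.sort(key=lambda a: score(evaluator, a), reverse=True)
    new_agents = candidates[:len(agents)]
    # Phase 2: agents frozen. Mutate the evaluator and keep the variant
    # that tracks true quality best (the current one stays a candidate).
    variants = [evaluator] + [evaluator + rng.uniform(-step, step) for _ in range(4)]
    new_evaluator = min(variants, key=lambda e: evaluator_error(e, new_agents))
    return new_agents, new_evaluator

if __name__ == "__main__":
    rng = random.Random(0)
    agents, evaluator = [0.0] * 4, 1.0
    for epoch in range(10):
        agents, evaluator = run_epoch(agents, evaluator, rng)
        print(epoch, round(max(agents), 2), round(evaluator, 2))
```

In this sketch the evaluator only changes once agents outgrow what it can measure, which is one simple way to keep the "moving target" tied to an external anchor rather than drifting freely.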

This is a direct evolution of Jürgen Schmidhuber’s 2003 Gödel Machine concept, which proposed AI systems that could rewrite their own code using formal mathematical proofs. That original idea was elegant on paper but largely impractical in the real world. The new RQGM model swaps out formal proofs for something more organic: Darwinian mutation and iterative co-evolution.

The preliminary results are noteworthy across several domains. Acceptance rates for co-evolved writers in scientific paper submissions jumped by 1.78x to 1.86x when evaluated by diverse AI judge panels. Co-evolved graders showed a 9% accuracy improvement on Olympiad-level mathematical proofs. And coding benchmarks demonstrated a 1.35x to 1.72x reduction in tokens used, suggesting that co-evolved systems don’t just perform better, they perform more efficiently.

Why static evaluators are the real bottleneck

Previous approaches relied on fixed benchmarks and static evaluation criteria. The AI would optimize against those criteria, hit the ceiling of what the evaluator could measure, and then plateau. By allowing evaluators to co-evolve alongside the agents they’re judging, RQGM creates a moving target that prevents this kind of gaming.
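The plateau can be reproduced in a deliberately tiny, deterministic toy. It is a sketch under stated assumptions rather than anything taken from the paper: the evaluator is modelled as a ceiling on measurable quality, and `measured`, `climb`, `gain`, and `ceiling_gain` are hypothetical names and parameters chosen for illustration.

```python
def measured(quality, ceiling):
    """A capped evaluator: it cannot tell apart anything above its ceiling."""
    return min(quality, ceiling)

def climb(epochs, steps_per_epoch, ceiling, coevolve, gain=1.0, ceiling_gain=5.0):
    """Hill-climb an agent against an evaluator that is static or co-evolving."""
    quality = 0.0
    history = []
    for _ in range(epochs):
        for _ in range(steps_per_epoch):
            candidate = quality + gain
            # A change is kept only if the evaluator can see the improvement.
            if measured(candidate, ceiling) > measured(quality, ceiling):
                quality = candidate
        # Assumed update rule: once the agent saturates the evaluator,
        # a co-evolving evaluator raises its ceiling for the next epoch.
        if coevolve and quality >= ceiling:
            ceiling += ceiling_gain
        history.append(quality)
    return history

print(climb(4, 5, ceiling=3.0, coevolve=False))  # [3.0, 3.0, 3.0, 3.0]
print(climb(4, 5, ceiling=3.0, coevolve=True))   # [3.0, 8.0, 13.0, 18.0]
```

With the static evaluator the agent stops at the ceiling after the first epoch; with the co-evolving one, each epoch reopens room to improve.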

The researchers themselves flag alignment concerns about what happens when ground-truth metrics—the supposedly objective benchmarks anchoring the system—start influencing the evolutionary trajectory. If the ground truth itself is flawed or biased, co-evolution could amplify those flaws rather than correct them.

What this means for investors watching AI infrastructure #

The paper hasn’t undergone peer review yet, and the findings are described as preliminary empirical investigations. That’s a meaningful disclaimer.

The coding efficiency gains alone, a reduction in token usage of up to 1.72x (roughly 42% fewer tokens), would translate into meaningful cost reductions for companies running large language model inference at scale if they hold up beyond the benchmarks tested.
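To put the reported factors in more familiar terms, the short calculation below converts a reduction factor into a fraction of tokens saved. It assumes "1.72x reduction" means baseline tokens divided by new tokens equals 1.72, and that cost scales linearly with token count; `tokens_saved_fraction` is a hypothetical helper name.

```python
def tokens_saved_fraction(reduction_factor: float) -> float:
    """Fraction of tokens saved when usage falls by `reduction_factor` (baseline / new)."""
    if reduction_factor <= 0:
        raise ValueError("reduction_factor must be positive")
    return 1.0 - 1.0 / reduction_factor

# The two ends of the range quoted in the article.
for factor in (1.35, 1.72):
    print(f"{factor}x fewer tokens -> {tokens_saved_fraction(factor):.1%} saved")
# 1.35x fewer tokens -> 25.9% saved
# 1.72x fewer tokens -> 41.9% saved
```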

The alignment concerns raised in the paper also deserve investor attention. As AI systems gain the ability to modify their own evaluation criteria, regulatory scrutiny will almost certainly follow.

Disclosure: This article was edited by Editorial Team. For more information on how we create and review content, see our Editorial Policy.


