EVOM: Agentic Meta-Evolution of Actor-Critic Architectures for Reinforcement Learning

Source: arxiv.org · Published: 2026-06-26T04:00Z · Topic: machine-learning

Researchers introduced EVOM, an agentic meta-evolution framework in which an LLM-based design agent automates the discovery of high-performance actor-critic architectures for reinforcement learning. According to the abstract, EVOM outperformed a manually designed baseline, an LLM-guided random search, and the LLM-guided programmatic policy search method MLES on the Ant-v4 and HalfCheetah-v4 tasks.

arXiv:2606.26327v1 (announce type: new)

Abstract: In actor-critic reinforcement learning, network architectures are typically designed by hand. Automating this design is challenging because each candidate must be trained before it can be evaluated, and the design space is open-ended. To address these challenges, we introduce EVOM, an agentic meta-evolution framework for discovering high-performance actor-critic architectures. We frame architecture search as a bi-level optimization: an inner loop trains weights via low-fidelity proximal policy optimization (PPO), while an outer loop drives meta-evolution by iteratively refining architecture programs. Crucially, this outer loop is powered by an LLM-based design agent that operates purely as an architecture designer, completely decoupled from policy execution and environment control. Experiments reveal that EVOM outperforms the manually designed baseline, an LLM-guided random search, and the state-of-the-art LLM-guided programmatic policy search method MLES, delivering superior performance on Ant-v4 and HalfCheetah-v4. Ablation studies validate that both the meta-evolution loop and the LLM Design Agent are indispensable for final performance.
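The bi-level structure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: every name below (`ArchSpec`, `low_fidelity_score`, `propose`, `meta_evolve`) is hypothetical, the inner loop is replaced by a cheap deterministic scoring function instead of an actual short PPO run, and the LLM-based design agent is replaced by a random mutation. Only the control flow (an outer loop that proposes architectures, an inner loop that scores them, and selection of the best candidates) is meant to mirror the description.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class ArchSpec:
    """Hypothetical architecture description: hidden-layer widths only."""
    actor_hidden: Tuple[int, ...]
    critic_hidden: Tuple[int, ...]


def low_fidelity_score(arch: ArchSpec) -> float:
    """Stand-in for the inner loop.

    A real inner loop would train weights with a short PPO run and return
    the resulting episodic return. This toy function simply prefers
    widths close to 128 so the example runs without any RL library.
    """
    def part(widths: Tuple[int, ...]) -> float:
        return -sum((w - 128) ** 2 for w in widths) / 1000.0

    return part(arch.actor_hidden) + part(arch.critic_hidden)


def propose(parent: ArchSpec, rng: random.Random) -> ArchSpec:
    """Stand-in for the design agent: randomly perturb one layer width.

    In the framework described above this step is performed by an LLM;
    the random mutation here is an assumption made for illustration.
    """
    def mutate(widths: Tuple[int, ...]) -> Tuple[int, ...]:
        out = list(widths)
        i = rng.randrange(len(out))
        out[i] = max(8, out[i] + rng.choice([-32, -16, 16, 32]))
        return tuple(out)

    if rng.random() < 0.5:
        return ArchSpec(mutate(parent.actor_hidden), parent.critic_hidden)
    return ArchSpec(parent.actor_hidden, mutate(parent.critic_hidden))


def meta_evolve(
    seed_arch: ArchSpec,
    evaluate: Callable[[ArchSpec], float],
    propose_fn: Callable[[ArchSpec, random.Random], ArchSpec],
    generations: int = 20,
    children: int = 4,
    population: int = 3,
    seed: int = 0,
) -> Tuple[float, ArchSpec]:
    """Outer loop: propose candidates, score them, keep the best few."""
    rng = random.Random(seed)
    pool: List[Tuple[float, ArchSpec]] = [(evaluate(seed_arch), seed_arch)]
    for _ in range(generations):
        parents = [arch for _, arch in pool]
        for _ in range(children):
            child = propose_fn(rng.choice(parents), rng)
            pool.append((evaluate(child), child))
        # Elitist selection: sort by score only and truncate the pool.
        pool.sort(key=lambda pair: pair[0], reverse=True)
        pool = pool[:population]
    return pool[0]
```

Calling `meta_evolve(ArchSpec((64, 64), (64, 64)), low_fidelity_score, propose)` returns the best `(score, architecture)` pair found. Because selection is elitist, the returned score is never worse than that of the seed architecture. Swapping `low_fidelity_score` for a real short training run and `propose` for an LLM call would be the two places where an actual system diverges from this sketch.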

source & further reading

arxiv.org — original article

Read original on arxiv.org → arxiv.org/abs/2606.26327
