{"slug": "evom-agentic-meta-evolution-of-actor-critic-architectures-for-reinforcement", "title": "EVOM: Agentic Meta-Evolution of Actor-Critic Architectures for Reinforcement Learning", "summary": "Researchers introduced EVOM, an agentic meta-evolution framework that uses an LLM-based design agent to automate the discovery of high-performance actor-critic architectures for reinforcement learning. EVOM outperformed a manually designed baseline, an LLM-guided random search, and the LLM-guided programmatic policy search method MLES on the Ant-v4 and HalfCheetah-v4 tasks.", "body_md": "arXiv:2606.26327v1 Announce Type: new\nAbstract: In actor-critic reinforcement learning, network architectures are typically manually designed. Automating this design is challenging because each candidate must be trained before evaluation, and the design space is open-ended. To address these challenges, we introduce EVOM, an agentic meta-evolution framework for discovering high-performance actor-critic architectures. We frame architecture search as a bi-level optimization: an inner loop trains weights via low-fidelity proximal policy optimization (PPO), while an outer loop drives meta-evolution by iteratively refining architecture programs. Crucially, this outer loop is powered by an LLM-based design agent that operates purely as an architecture designer, completely decoupled from policy execution and environment control. Experiments reveal that EVOM outperforms the manually designed baseline, an LLM-guided random search, and the state-of-the-art LLM-guided programmatic policy search method MLES, delivering superior performance on Ant-v4 and HalfCheetah-v4. 
Ablation studies validate that both the meta-evolution loop and the LLM Design Agent are indispensable for final performance.", "url": "https://wpnews.pro/news/evom-agentic-meta-evolution-of-actor-critic-architectures-for-reinforcement", "canonical_source": "https://arxiv.org/abs/2606.26327", "published_at": "2026-06-26 04:00:00+00:00", "updated_at": "2026-06-26 04:18:11.726200+00:00", "lang": "en", "topics": ["machine-learning", "large-language-models", "ai-research"], "entities": ["EVOM", "PPO", "MLES", "Ant-v4", "HalfCheetah-v4"], "alternates": {"html": "https://wpnews.pro/news/evom-agentic-meta-evolution-of-actor-critic-architectures-for-reinforcement", "markdown": "https://wpnews.pro/news/evom-agentic-meta-evolution-of-actor-critic-architectures-for-reinforcement.md", "text": "https://wpnews.pro/news/evom-agentic-meta-evolution-of-actor-critic-architectures-for-reinforcement.txt", "jsonld": "https://wpnews.pro/news/evom-agentic-meta-evolution-of-actor-critic-architectures-for-reinforcement.jsonld"}}