03:37
2026-05-27
arxiv.org
machine-learning
FML-Bench: A Controlled Study of AI Research Agent Strategies
Researchers introduced FML-Bench, a benchmark of 18 machine learning research tasks across 10 domains, to isolate the impact of agent strategy from execution infrastructure on AI research agent perfor…