LLM-AutoSciLab: Closed-Loop Scientific Discovery via Active Experimentation with LLMs

Researchers have developed LLM-AutoSciLab, a closed-loop scientific discovery framework that uses large language models to iteratively generate hypotheses, select informative experiments, and refine mechanisms through active data acquisition rather than passive analysis of fixed datasets. Evaluated on NewtonBench and on two new datasets, ActiveSciBench-Chem (57 enzyme-kinetics tasks) and ActiveSciBench-GRN (45 gene-regulatory-network tasks), the system reached 67.6% and 35.1% symbolic accuracy on NewtonBench and ActiveSciBench-Chem, respectively, and 31.1% exact graph recovery on ActiveSciBench-GRN. Its hypothesis-guided experimentation was 2-5x more sample-efficient than the strongest competing baselines. The approach shifts scientific discovery from static inference to adaptive experimentation, addressing the problem that a limited set of observations can support several plausible mechanisms at once.

1 min read · published May 26, 2026

arXiv:2605.24043v1

Abstract: Scientific discovery is a closed-loop process in which hypotheses guide data acquisition and observations refine the hypothesis space. Yet most approaches reduce discovery to supervised learning over fixed datasets, where limited observations can support multiple plausible mechanisms that fit locally but fail to generalize. Thus, the key challenge is selecting informative observations to resolve uncertainty, shifting the focus from static inference to adaptive data acquisition. To address this, we propose LLM-AutoSciLab, a closed-loop framework that couples hypothesis generation with hypothesis-conditioned experiment selection and mechanism refinement. Rather than fitting models to passively collected data, LLM-AutoSciLab iteratively proposes plausible hypotheses, selects informative experiments to distinguish or refine them, and updates its state using the resulting evidence. To evaluate dynamic, closed-loop scientific discovery with active data acquisition, we introduce ActiveSciBench, comprising two datasets: ActiveSciBench-Chem with 57 enzyme-kinetics tasks and ActiveSciBench-GRN with 45 gene-regulatory-network tasks. These datasets model discovery as a budget-constrained process requiring adaptive experiment design, variable selection, and recovery of true mechanisms. Across NewtonBench, ActiveSciBench-Chem, and ActiveSciBench-GRN, LLM-AutoSciLab outperforms prior methods, achieving 67.6% and 35.1% symbolic accuracy on NewtonBench and ActiveSciBench-Chem, respectively, and 31.1% exact graph recovery on ActiveSciBench-GRN. Moreover, hypothesis-guided experimentation is 2-5x more sample-efficient than the strongest competing baselines. Code and data are available at: https://github.com/scientific-discovery/LLM-AutoSciLab
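To make the propose, select, update cycle concrete, here is a minimal toy sketch in Python. It is not the authors' implementation and does not use their code or benchmarks. Everything in it is an assumption made for illustration: the hidden rate law inside `run_experiment`, the three hard-coded hypothesis families that stand in for LLM-proposed hypotheses, the grid-search `fit`, and the "maximum disagreement between fitted hypotheses" rule used in `select_experiment`.

```python
import math
from typing import Callable, Dict, List, Set, Tuple

Observation = Tuple[float, float]            # (substrate concentration, rate)
Model = Callable[[float, float, float], float]
Fit = Tuple[float, Tuple[float, float]]      # (squared error, (a, b))


def run_experiment(s: float) -> float:
    """Hypothetical noise-free simulator hiding a Michaelis-Menten law."""
    vmax, km = 2.0, 0.5
    return vmax * s / (km + s)


# Assumption: fixed candidate families replace LLM-generated hypotheses.
HYPOTHESES: Dict[str, Model] = {
    "linear": lambda s, a, b: a * s,
    "michaelis_menten": lambda s, a, b: a * s / (b + s),
    "saturating_exp": lambda s, a, b: a * (1.0 - math.exp(-s / b)),
}

GRID = [k / 10 for k in range(1, 41)]        # parameter values 0.1 .. 4.0
POOL = [0.05, 0.1, 0.25, 0.5, 1.0, 2.0, 4.0, 8.0]  # allowed experiments


def fit(model: Model, data: List[Observation]) -> Fit:
    """Grid search for the (a, b) pair with the lowest squared error."""
    best_err, best_params = math.inf, (GRID[0], GRID[0])
    for a in GRID:
        for b in GRID:
            err = sum((model(s, a, b) - y) ** 2 for s, y in data)
            if err < best_err:
                best_err, best_params = err, (a, b)
    return best_err, best_params


def select_experiment(fits: Dict[str, Fit], tried: Set[float]) -> float:
    """Pick the untried input where the fitted hypotheses disagree most."""
    def spread(s: float) -> float:
        preds = [HYPOTHESES[name](s, *params)
                 for name, (_, params) in fits.items()]
        return max(preds) - min(preds)
    return max((s for s in POOL if s not in tried), key=spread)


def closed_loop(budget: int = 4):
    """Run `budget` rounds of fit -> select -> observe from one seed point."""
    data: List[Observation] = [(POOL[1], run_experiment(POOL[1]))]
    for _ in range(budget):
        fits = {name: fit(m, data) for name, m in HYPOTHESES.items()}
        s_next = select_experiment(fits, {s for s, _ in data})
        data.append((s_next, run_experiment(s_next)))
    fits = {name: fit(m, data) for name, m in HYPOTHESES.items()}
    best = min(fits, key=lambda name: fits[name][0])
    return best, fits[best][1], data


if __name__ == "__main__":
    name, params, data = closed_loop(budget=4)
    print(name, params, [s for s, _ in data])
```

With a single seed observation, all three families can fit the data closely, which is the ambiguity the abstract describes. Each round then spends one unit of the experiment budget where the surviving explanations diverge most, so in this noise-free toy setting the loop settles on the Michaelis-Menten family with the simulator's own parameters after four additional experiments.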
