LLM-AutoSciLab: Closed-Loop Scientific Discovery via Active Experimentation with LLMs

wpnews.pro

[ARTICLE · art-14014] src=arxiv.org ↗ pub=2026-05-26T04:00Z topic=large-language-models verified=true sentiment=↑ positive

Researchers have developed LLM-AutoSciLab, a closed-loop scientific discovery framework that uses large language models to iteratively generate hypotheses, select informative experiments, and refine mechanisms through active data acquisition rather than passive analysis of fixed datasets. The authors also introduce ActiveSciBench, a benchmark made up of two datasets, ActiveSciBench-Chem and ActiveSciBench-GRN. The system reaches 67.6% symbolic accuracy on NewtonBench, 35.1% symbolic accuracy on ActiveSciBench-Chem, and 31.1% exact graph recovery on ActiveSciBench-GRN, and its hypothesis-guided experimentation is reported to be 2-5x more sample-efficient than the strongest competing baselines. The approach shifts scientific discovery from static inference to adaptive experimentation, addressing the problem that a limited set of observations can support several plausible mechanisms at once.

read: 1 min · views: 4 · published: May 26, 2026

arXiv:2605.24043v1 · Announce Type: new

Abstract: Scientific discovery is a closed-loop process in which hypotheses guide data acquisition and observations refine the hypothesis space. Yet most approaches reduce discovery to supervised learning over fixed datasets, where limited observations can support multiple plausible mechanisms that fit locally but fail to generalize. Thus, the key challenge is selecting informative observations to resolve uncertainty, shifting the focus from static inference to adaptive data acquisition.

To address this, we propose LLM-AutoSciLab, a closed-loop framework that couples hypothesis generation with hypothesis-conditioned experiment selection and mechanism refinement. Rather than fitting models to passively collected data, LLM-AutoSciLab iteratively proposes plausible hypotheses, selects informative experiments to distinguish or refine them, and updates its state using the resulting evidence.

To evaluate dynamic, closed-loop scientific discovery with active data acquisition, we introduce ActiveSciBench, comprising two datasets: ActiveSciBench-Chem with 57 enzyme-kinetics tasks and ActiveSciBench-GRN with 45 gene-regulatory-network tasks. These datasets model discovery as a budget-constrained process requiring adaptive experiment design, variable selection, and recovery of true mechanisms.

Across NewtonBench, ActiveSciBench-Chem, and ActiveSciBench-GRN, LLM-AutoSciLab outperforms prior methods, achieving 67.6% and 35.1% symbolic accuracy on NewtonBench and ActiveSciBench-Chem, respectively, and 31.1% exact graph recovery on ActiveSciBench-GRN. Moreover, hypothesis-guided experimentation is 2-5x more sample-efficient than the strongest competing baselines.

Code and data are available at: https://github.com/scientific-discovery/LLM-AutoSciLab
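The propose, select, update loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the LLM hypothesis generator is replaced by a fixed, hand-written set of candidate rate laws, the "lab" is a noise-free simulated function, and experiment selection uses a simple maximum-disagreement rule. All names here (`CANDIDATES`, `run_experiment`, `disagreement`, `discover`) are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

Hypothesis = Callable[[float], float]

# ASSUMPTION: a fixed set of candidate rate laws v(S) stands in for the
# LLM-driven hypothesis generation step described in the abstract.
CANDIDATES: Dict[str, Hypothesis] = {
    "linear": lambda s: 0.5 * s,
    "michaelis_menten": lambda s: 2.0 * s / (1.0 + s),
    "substrate_inhibition": lambda s: 2.0 * s / (1.0 + s + s * s / 4.0),
}


def run_experiment(s: float) -> float:
    """Simulated lab (assumption): the hidden mechanism is Michaelis-Menten, noise-free."""
    return 2.0 * s / (1.0 + s)


def disagreement(s: float, alive: List[str]) -> float:
    """Spread of the surviving hypotheses' predictions at substrate level s."""
    preds = [CANDIDATES[name](s) for name in alive]
    return max(preds) - min(preds)


def discover(
    design_space: List[float], budget: int, tol: float = 1e-6
) -> Tuple[List[str], List[Tuple[float, float]]]:
    """Run the closed loop until one hypothesis remains or the budget is spent."""
    alive = list(CANDIDATES)  # current hypothesis space
    history: List[Tuple[float, float]] = []  # (experiment, observation) pairs
    for _ in range(budget):
        if len(alive) <= 1:
            break
        # Hypothesis-conditioned selection: pick the experiment on which
        # the surviving hypotheses disagree the most.
        s = max(design_space, key=lambda x: disagreement(x, alive))
        y = run_experiment(s)
        history.append((s, y))
        # Update: keep only hypotheses consistent with the new observation.
        alive = [n for n in alive if abs(CANDIDATES[n](s) - y) <= tol]
    return alive, history


if __name__ == "__main__":
    survivors, experiments = discover([0.1, 0.5, 1.0, 2.0, 4.0, 8.0], budget=5)
    print("surviving hypotheses:", survivors)
    print("experiments used:", len(experiments))
```

The point of the sketch is the budget argument: an experiment chosen where candidate mechanisms disagree can rule out several of them at once, whereas a passively sampled point where they nearly coincide tells the loop little. A real system would need noisy observations, a soft scoring rule instead of hard elimination, and a generator that proposes new hypotheses when none survive.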

source & further reading

arxiv.org — original article

Read original on arxiv.org → arxiv.org/abs/2605.24043
