From Residuals to Reasons: LLM-Guided Mechanism Inference from Tabular Data

wpnews.pro

Researchers have developed Multi-Agent Residual In-Context Learning (MARICL), a framework in which large language model agents analyze where a base statistical model fails and generate explicit correction terms from high-residual examples. Across nine benchmarks spanning scientific, biomedical, socioeconomic, and synthetic settings, MARICL improved on its base model on every dataset. Formulas frozen after being learned on one experimental batch of the Cell-Free Protein dataset improved predictions in over 92% of cases on held-out batches that used the same reagent protocol, but failed systematically under a different protocol. The authors read this boundary, which follows the biochemistry rather than the batch count, as direct evidence of mechanistic generalization rather than noise fitting.

Published May 25, 2026

arXiv:2605.22897v1. Abstract: A persistent challenge in machine learning for scientific applications is jointly achieving prediction and understanding. Statistical models excel on structured data but operate as black boxes, while existing interpretability methods are largely inspective: they answer "which features matter?" but do not articulate how features interact or refine explanations iteratively alongside human understanding. Asking an LLM to predict the target directly forces it to search the entire output space; we instead anchor predictions with a base model and ask the LLM the narrower question of what that model is missing. We introduce Multi-Agent Residual In-Context Learning (MARICL), an agentic framework in which LLM agents analyze where a base model fails, hypothesize missing structure from high-residual examples provided in context, and produce explicit correction terms refined through multi-turn textual gradient optimization. Across nine benchmarks spanning scientific, biomedical, socioeconomic, and synthetic settings, MARICL improves consistently over its base model on all datasets. To test whether these corrections reflect real structure or batch-specific noise, we freeze formulas learned on one experimental batch of the Cell-Free Protein dataset and apply them (with no retraining and no further LLM calls) to held-out batches. Within the same reagent protocol, the frozen formulas improve predictions in over 92% of cases; across a different protocol, they fail systematically. The success boundary aligns with the biochemistry, not the batch count, which is direct evidence of mechanistic generalization.
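The loop the abstract describes (fit a base model, surface its worst residuals, add an explicit correction term, then freeze that term and score it on held-out batches) can be sketched on toy data. This is an illustration only, not the authors' implementation: the data generator `make_batch`, the helper names, and the hand-written `frozen_correction` (which stands in for a formula an LLM agent would propose) are all assumptions, and treating one held-out batch as one "case" is just one possible reading of the abstract's 92% figure.

```python
from statistics import mean

def make_batch(effect, xs):
    """Hypothetical toy batch: y = 1 + 2*x + effect*z, where z flags x % 4 == 3."""
    rows = []
    for x in xs:
        z = int(x % 4 == 3)
        rows.append({"x": x, "z": z, "y": 1 + 2 * x + effect * z})
    return rows

def fit_line(rows):
    """Least-squares fit of y ~ a + b*x; stands in for the base model (it ignores z)."""
    xs = [r["x"] for r in rows]
    ys = [r["y"] for r in rows]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def top_residual_rows(rows, predict, k):
    """Rows where the model is most wrong, largest absolute residual first."""
    return sorted(rows, key=lambda r: abs(r["y"] - predict(r)), reverse=True)[:k]

def mse(rows, predict):
    """Mean squared error of a prediction function over a batch."""
    return mean((r["y"] - predict(r)) ** 2 for r in rows)

def improvement_rate(batches, base, corrected):
    """Fraction of batches where the corrected model has lower MSE than the base."""
    wins = sum(mse(rows, corrected) < mse(rows, base) for rows in batches)
    return wins / len(batches)

# 1. Fit the base model on a single training batch.
train = make_batch(effect=4, xs=range(10))
a, b = fit_line(train)

def base(row):
    return a + b * row["x"]

# 2. Surface the high-residual examples that would be shown to the LLM in context.
hard = top_residual_rows(train, base, k=2)

# 3. Hand-written stand-in for an LLM-proposed correction term, frozen from here on.
def frozen_correction(row):
    return 4 * row["z"]

def corrected(row):
    return base(row) + frozen_correction(row)

# 4. Score the frozen formula on held-out batches, with no refitting.
same_protocol = [make_batch(4, range(10, 20)), make_batch(4, range(20, 30))]
other_protocol = [make_batch(-4, range(10, 20))]  # the sign of the effect flips

print([r["x"] for r in hard])
print(improvement_rate(same_protocol, base, corrected))
print(improvement_rate(other_protocol, base, corrected))
```

In this toy setup the high-residual rows are exactly the ones where the hidden flag `z` is set, which is the kind of pattern a correction term is meant to capture. The frozen term helps on every held-out batch that shares the training batch's generating rule and hurts on the batch where the rule changes, mirroring in miniature the same-protocol versus different-protocol contrast the abstract reports.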

source & further reading

Read original on arxiv.org → arxiv.org/abs/2605.22897
