An AI agent for treatment reasoning over a biomedical tool universe

wpnews.pro

Source: arxiv.org · Published: 2026-06-30T04:00Z · Topic: artificial-intelligence

Researchers introduced ATHENA-R1, an AI agent trained with reinforcement learning over 212 biomedical tools to perform treatment reasoning across all FDA-approved drugs since 1939. Across five benchmarks, the agent reached 94.7% accuracy on open-ended drug reasoning and 82.9% on treatment reasoning, 17.8 and 10.7 points above GPT-5 respectively, and was preferred over reference models in blinded evaluations by experts from 28 rare disease organizations. Adverse-event hypotheses it generated, tested in electronic health records from 5.4 million patients, reached adjusted odds ratios of 1.48-1.84, with no elevation among negative controls.
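The reported margins imply GPT-5's accuracy on the same tasks. The short Python sketch below works that out, assuming "points" means percentage points measured on the same benchmarks. The resulting baseline figures are derived here and are not stated in the source.

```python
# Reported ATHENA-R1 accuracy (%) and reported margin over GPT-5 (points).
# Assumption: margins are percentage points on the same benchmarks.
reported = {
    "open-ended drug reasoning": (94.7, 17.8),
    "treatment reasoning": (82.9, 10.7),
}


def implied_baseline(score: float, margin: float) -> float:
    """Return the baseline accuracy implied by a score and its margin."""
    return round(score - margin, 1)


implied_gpt5 = {
    task: implied_baseline(score, margin)
    for task, (score, margin) in reported.items()
}

for task, accuracy in implied_gpt5.items():
    print(f"{task}: implied GPT-5 accuracy {accuracy}%")
```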

arXiv:2606.28692v1 (announce type: new)

Abstract: Treatment reasoning underpins every therapeutic decision, integrating disease context, comorbidities, medications, contraindications, and evolving biomedical knowledge to select an appropriate therapy. It is inherently iterative: candidates are weighed against many constraints, revised as evidence emerges, and grounded in verifiable sources. Here we introduce ATHENA-R1, an AI agent for treatment reasoning across all FDA-approved drugs since 1939, trained by reinforcement learning over a universe of 212 biomedical tools. At each step it identifies missing information, selects and runs relevant tools, and incorporates the evidence. To train it without human-annotated traces, we build a two-level self-learning framework: multi-agent systems construct the tools, tasks, and reasoning trajectories for supervised fine-tuning, then reinforcement learning with scientific feedback rewards reasoning quality (evidence gathering, grounded tool use, logical non-redundancy). Across five benchmarks of 3,168 drug reasoning tasks and 456 patient treatment cases, ATHENA-R1 outperforms language models and tool-use systems, reaching 94.7% accuracy on open-ended drug reasoning and 82.9% on treatment reasoning, 17.8 and 10.7 points above GPT-5. In blinded evaluations by experts from 28 rare disease organizations, it is preferred over reference models on all criteria, and physicians rated it favorably on complex hospitalized cardiovascular and infectious-disease cases. Adverse-event hypotheses it generated, tested in electronic health records from 5.4 million patients, reached adjusted odds ratios of 1.48-1.84, with no elevation among negative controls. Because it requires knowing what evidence to seek before concluding, treatment reasoning has long been hard for AI; we show it can be reframed as a learnable process of iterative evidence gathering that reinforcement learning can train AI to perform.
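The per-step loop described in the abstract (identify missing information, select and run a tool, incorporate the evidence) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The `ReasoningState` class, the `run_agent` function, the tool names, and the stub outputs are all hypothetical, and a plain dictionary lookup stands in for the learned tool-selection policy that the paper trains with reinforcement learning.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical tool interface: takes the question, returns evidence text.
Tool = Callable[[str], str]


@dataclass
class ReasoningState:
    question: str
    needed: List[str]  # kinds of information the case requires
    evidence: Dict[str, str] = field(default_factory=dict)
    trace: List[str] = field(default_factory=list)


def run_agent(state: ReasoningState, tools: Dict[str, Tool],
              max_steps: int = 8) -> ReasoningState:
    """Gather evidence step by step until nothing is missing."""
    for _ in range(max_steps):
        # 1. Identify missing information.
        missing = [n for n in state.needed if n not in state.evidence]
        if not missing:
            break
        need = missing[0]
        # 2. Select a tool (a learned policy would choose here).
        tool = tools.get(need)
        if tool is None:
            state.evidence[need] = "unavailable"
            state.trace.append(f"{need}: no tool found")
            continue
        # 3. Run the tool and incorporate its evidence.
        state.evidence[need] = tool(state.question)
        state.trace.append(f"{need}: tool called")
    return state


# Stub tools that return placeholder text, not real biomedical data.
stub_tools: Dict[str, Tool] = {
    "contraindications": lambda q: f"stub contraindication record for: {q}",
    "interactions": lambda q: f"stub interaction record for: {q}",
}

final = run_agent(
    ReasoningState(
        question="example case",
        needed=["contraindications", "interactions", "dosing"],
    ),
    stub_tools,
)
print(final.trace)
```

In this toy run the agent finds tools for two of the three needs and records the third as unavailable, so the trace shows which evidence the final answer would be grounded in.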

source & further reading

arxiv.org — original article

Read original on arxiv.org → arxiv.org/abs/2606.28692
