Towards Conversational AI for Disease Management

wpnews.pro


Source: nature.com · Published: 2026-06-17T15:08Z · Topic: large-language-models

Google's AMIE, an LLM-based agentic system, was non-inferior to primary care physicians in management reasoning in a randomized, blinded study of 100 multi-visit clinical scenarios. It scored better on the precision of treatments and investigations and on alignment with clinical guidelines, and it outperformed physicians on the higher-difficulty questions of a medication-reasoning benchmark, marking a step toward conversational AI for disease management.

3 min read · published Jun 17, 2026

Abstract

While large language models (LLMs) have shown promise in diagnostic dialogue [1], their capabilities for effective management reasoning—including disease progression, therapeutic response, and safe medication prescription—remain under-explored. We advance the previously demonstrated diagnostic capabilities of the Articulate Medical Intelligence Explorer (AMIE) [1-3] through a new LLM-based agentic system optimized for multi-visit clinical management and dialogue. To ground its reasoning in authoritative clinical knowledge, AMIE leverages Gemini’s long-context capabilities [4], combining in-context retrieval with structured reasoning to align its output with up-to-date clinical practice guidelines and drug formularies. In a randomized, blinded virtual Objective Structured Clinical Examination (OSCE) study, AMIE was compared to 21 primary care physicians (PCPs) across 100 multi-visit case scenarios designed to reflect UK NICE Guidance and BMJ Best Practice guidelines. AMIE was non-inferior to PCPs in management reasoning as assessed by specialists and scored better in both preciseness of treatments and investigations, and in its alignment with and grounding in clinical guidelines. To benchmark medication reasoning, we developed RxQA, a multiple-choice question benchmark derived from two national drug formularies (US, UK) and validated by board-certified pharmacists. Though AMIE and PCPs both benefited from the ability to access external drug information, AMIE outperformed PCPs on higher difficulty questions. While further research would be needed before real-world translation, AMIE’s strong performance across evaluations marks a significant step towards conversational AI as a tool in disease management.
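As a rough illustration of how results on a multiple-choice benchmark such as RxQA might be broken down by question difficulty, the sketch below computes per-difficulty accuracy. The record layout, the difficulty labels ("lower", "higher"), and the sample answers are all hypothetical and invented for this example; they are not taken from the paper, and the authors' actual scoring procedure may differ.

```python
from collections import defaultdict

# Hypothetical records: (difficulty, correct_option, chosen_option).
# These labels and answers are invented for illustration only.
answers = [
    ("lower", "A", "A"),
    ("lower", "C", "B"),
    ("higher", "D", "D"),
    ("higher", "B", "B"),
    ("higher", "A", "C"),
]


def accuracy_by_difficulty(records):
    """Return {difficulty: fraction of questions answered correctly}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for difficulty, correct, chosen in records:
        totals[difficulty] += 1
        if chosen == correct:
            hits[difficulty] += 1
    return {level: hits[level] / totals[level] for level in totals}


scores = accuracy_by_difficulty(answers)
for level, score in sorted(scores.items()):
    print(f"{level}: {score:.2f}")
```

Running the same function over one answer list per respondent group would allow a side-by-side comparison at each difficulty level, which is the kind of contrast the abstract reports for higher-difficulty questions.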


Supplementary information

Supplementary Information (PDF): Supplementary discussion, methods and results (Sections 1-16). Contains related work, details on the system design for the Mx agent and Dialogue agent, details on the OSCE evaluation study (inter-rater reliability analysis, clinician metadata, scenario metadata, ablation analysis), and methods details and further results for the RxQA medication reasoning benchmark.

Supplementary Data 1 (PDF): Detailed view of two sample scenarios with AMIE and PCP output and evaluation gradings. Full details for two sample scenarios used in the OSCE evaluation study, including scenario information, AMIE-patient-actor conversations, PCP-patient-actor conversations, specialist physician gradings and patient actor gradings for all three visits per scenario.

Supplementary Data 2 (PDF): Details for all 120 OSCE scenarios with AMIE output. Scenario details and AMIE output for all 120 scenarios used either in the OSCE evaluation study (100) or for validation purposes (20), in human-readable PDF format.

Supplementary Data 3 (CSV): Details for all 120 OSCE scenarios with AMIE output. Scenario details and AMIE output for all 120 scenarios used either in the OSCE evaluation study (100) or for validation purposes (20), in machine-readable CSV format.

About this article

Cite this article

Liévin, V., Palepu, A., Weng, WH. et al. Towards Conversational AI for Disease Management. Nature (2026). https://doi.org/10.1038/s41586-026-10764-5


Read original on nature.com → www.nature.com/articles/s41586-026-10764-5
