Stanford's Merlin puts vision-language AI on full 3D CT scans


[ARTICLE · art-47154] src=runtimewire.com ↗ pub=2026-07-03T05:16Z topic=artificial-intelligence verified=true sentiment=↑ positive


Stanford researchers led by Louis Blankemeier, Ashwin Kumar and Akshay S. Chaudhari published Merlin, a 3D vision-language foundation model for CT scans, in Nature on March 4. The model was trained on more than 6 million images from 15,331 CT scans and evaluated on 752 tasks, including disease prediction and report generation. It targets a gap left by medical AI models that focus on 2D images.


A Stanford-led team headed by Louis Blankemeier, Ashwin Kumar and Akshay S. Chaudhari published Merlin, a 3D vision-language foundation model for computed tomography, in a March 4 Nature paper. The work takes aim at one of radiology AI's practical gaps: most medical vision-language models have been built around 2D images and shorter text, while CT interpretation is volumetric, text-heavy and tied to patient history.

Merlin was trained on paired abdominal CT scans, diagnosis codes and radiology reports. The training set included more than 6 million images from 15,331 CT scans, more than 1.8 million diagnosis codes and more than 6 million report tokens. The researchers evaluated the model on six task types spanning 752 individual tasks: zero-shot findings classification, phenotype classification, image-report retrieval, 5-year chronic disease prediction, radiology report generation and 3D organ segmentation.


Read original on runtimewire.com → runtimewire.com/article/stanford-merlin-3d-ct-vi…
