Small LLMs for Biomedical Claim Verification: Cost-Effective Fine-Tuning, Structural Dataset Shortcuts, and Cross-Domain Generalization

wpnews.pro


Source: arxiv.org · Published: 2026-06-12T04:00Z · Topic: large-language-models


Researchers fine-tuned three small language models (Phi-3-mini at 3.8B parameters, Qwen2.5-3B, and Mistral-7B) with QLoRA on the biomedical claim verification datasets SciFact and HealthVer. The fine-tuned Mistral-7B outperformed both GPT-4o and GPT-5 by up to 12% F1 at a fraction of the cost, using only 1,008 training examples. The study also identified a previously unreported structural artifact in SciFact that inflates in-domain scores, and showed through bidirectional cross-domain evaluation that training on structurally sound data enables robust cross-domain transfer. The results suggest that inexpensive fine-tuning of small models can match or exceed large proprietary systems on this task. The authors plan to release all code and adapter checkpoints.

1 min read · published Jun 12, 2026

arXiv:2606.12854v1. Abstract: Large Language Models such as GPT-4o and GPT-5 achieve strong zero-shot performance on biomedical claim verification, but cost and opacity limit scalable use. We fine-tune three small LLMs: Phi-3-mini (3.8B), Qwen2.5-3B, and Mistral-7B, via QLoRA on SciFact and HealthVer, providing the first study of QLoRA models against GPT-4o and fine-tuned BioLinkBERT encoders. Mistral-7B QLoRA surpasses both GPT-4o and GPT-5 (up to 12% F1 gain) at a fractional cost using just 1,008 training examples. We conduct extensive in-domain and cross-domain evaluation: models trained on SciFact tested on HealthVer and vice versa, at matched sizes to isolate dataset structure from data quantity. We identify a previously unreported structural artifact in SciFact that inflates in-domain scores, and show through bidirectional out-of-domain evaluation that training on structurally sound data enables robust cross-domain transfer. We plan to release all code and adapter checkpoints.
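The evaluation design described in the abstract (train on one dataset, test on both, with training sets cut to the same size) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration, not the authors' code: the three-way label set, the helper names `subsample`, `macro_f1`, and `cross_domain_grid`, the use of macro-averaged F1 (the abstract only says "F1"), and the toy `majority_baseline` that stands in for a QLoRA fine-tuned model.

```python
import random
from collections import Counter

# Hypothetical three-way label set; the real datasets use their own label names.
LABELS = ("SUPPORT", "REFUTE", "NEI")


def subsample(examples, n, seed=0):
    """Draw n examples without replacement, reproducibly, so that every
    source dataset contributes the same number of training examples."""
    if n > len(examples):
        raise ValueError("requested more examples than available")
    return random.Random(seed).sample(examples, n)


def macro_f1(gold, pred, labels=LABELS):
    """Unweighted mean of per-label F1 (assumed averaging scheme)."""
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def cross_domain_grid(datasets, train_fn, n_train, seed=0):
    """Train on each dataset at a matched size, then score on every test set.

    datasets maps a name to {"train": [...], "test": [...]}, where each
    example is a (text, label) pair. train_fn takes a training list and
    returns a callable text -> label.
    """
    results = {}
    for src, src_data in datasets.items():
        model = train_fn(subsample(src_data["train"], n_train, seed))
        for tgt, tgt_data in datasets.items():
            gold = [label for _, label in tgt_data["test"]]
            pred = [model(text) for text, _ in tgt_data["test"]]
            results[(src, tgt)] = macro_f1(gold, pred)
    return results


def majority_baseline(train):
    """Toy stand-in for a fine-tuned model: always predicts the most
    frequent training label."""
    top = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda text: top
```

In the resulting grid, entries such as `("SciFact", "SciFact")` are in-domain scores and entries such as `("SciFact", "HealthVer")` are cross-domain scores. Holding `n_train` fixed across sources is what lets a gap between the two be attributed to dataset structure rather than to the amount of training data.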

source & further reading


Read original on arxiv.org → arxiv.org/abs/2606.12854



