LLMs Extract Drug Discontinuations From Estonian EHRs

Source: letsdatascience.com · Published: 2026-06-17T21:53Z · Topic: large-language-models

Researchers combined prescription records with free-text anamneses from a 10% sample of the Estonian population (2012-2019) and applied Llama-3.1-70B and GPT-4o to extract drug discontinuation events and reasons for statins and antidiabetic medications, demonstrating LLM utility for pharmacoepidemiology in a low-resource language.

3 min read · Published Jun 17, 2026

Per a JMIR preprint by Suvalov et al., researchers combined prescription records with free-text anamneses from a 10% sample of the Estonian population (2012-2019) to identify drug discontinuation events and their reasons. The study applied Llama-3.1-70B and GPT-4o to extract discontinuation phrases, map them into a clinician-developed taxonomy, and label who initiated the stoppage; performance was evaluated on 100 randomly selected cases per drug group (statins and antidiabetic medications), according to the preprint. This work demonstrates a practical application of LLMs to a low-resource language for pharmacoepidemiology, highlighting both potential gains for large-scale adherence research and the need for careful validation on clinical free text.

What happened

Per a JMIR preprint by Suvalov et al., the authors merged prescription data with free-text clinical anamneses from a 10% sample of the Estonian population covering 2012-2019. The study targeted discontinuations for statins and antidiabetic medications and applied two large language models, Llama-3.1-70B and GPT-4o, to:

• extract discontinuation phrases
• classify reasons using a clinician-developed taxonomy
• identify whether the patient or clinician initiated the discontinuation

Performance was measured on 100 randomly chosen cases per drug group, as reported in the preprint.
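The three subtasks above can be framed as one structured-output request per note. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the reason and initiator labels, the prompt wording, and the function names `build_prompt` and `parse_response` are all hypothetical, and the preprint's actual taxonomy is not reproduced here. The model call itself is left out; the sketch only shows how a request could be assembled and how a reply could be validated before it is trusted.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels for illustration only; the clinician-developed
# taxonomy from the preprint is not reproduced here.
REASONS = {"adverse_effect", "inefficacy", "cost_or_access", "other", "unknown"}
INITIATORS = {"patient", "clinician", "unknown"}


@dataclass
class Discontinuation:
    phrase: str     # verbatim span copied from the note
    reason: str     # one of REASONS
    initiator: str  # one of INITIATORS


def build_prompt(note: str, drug_group: str) -> str:
    """Assemble a single instruction covering all three subtasks."""
    return (
        f"The clinical note below is written in Estonian and concerns {drug_group}.\n"
        "Return a JSON object with the keys 'phrase', 'reason' and 'initiator'.\n"
        f"Allowed reasons: {sorted(REASONS)}.\n"
        f"Allowed initiators: {sorted(INITIATORS)}.\n"
        "Leave 'phrase' empty if no discontinuation is mentioned.\n\n"
        f"Note:\n{note}"
    )


def parse_response(raw: str, note: str) -> Optional[Discontinuation]:
    """Validate a model reply; return None when it cannot be trusted."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    phrase = data.get("phrase", "")
    # Reject empty phrases and spans that do not occur verbatim in the note.
    if not isinstance(phrase, str) or not phrase or phrase not in note:
        return None
    reason = data.get("reason")
    initiator = data.get("initiator")
    # Labels outside the allowed sets are mapped to "unknown".
    if not isinstance(reason, str) or reason not in REASONS:
        reason = "unknown"
    if not isinstance(initiator, str) or initiator not in INITIATORS:
        initiator = "unknown"
    return Discontinuation(phrase, reason, initiator)
```

Requiring the extracted phrase to appear verbatim in the note is one simple guard against invented spans. It is a design choice of this sketch, not a step described in the article.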

Technical details

The preprint documents using Llama-3.1-70B and GPT-4o for information extraction and classification from Estonian-language clinical notes. The authors developed a taxonomy of discontinuation reasons with clinician input and applied the models to link free-text evidence to structured prescription records. The manuscript presents validation on a held-out sample; exact performance metrics are reported in the preprint.
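The article does not describe how free-text notes were matched to prescription records, so the sketch below shows just one plausible linkage rule: for each patient, collect the notes written within a fixed window after the last prescription in a drug group. The record types, the `candidate_notes` function, and the 180-day window are assumptions made for illustration and should not be read as the preprint's method.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Prescription:
    patient_id: str
    drug_group: str
    issued: date


@dataclass(frozen=True)
class Note:
    patient_id: str
    written: date
    text: str


def candidate_notes(
    prescriptions: Iterable[Prescription],
    notes: Iterable[Note],
    drug_group: str,
    window_days: int = 180,  # hypothetical window, not taken from the preprint
) -> Dict[str, List[Note]]:
    """Map each patient to notes written soon after their last prescription."""
    # Find the most recent prescription date per patient for this drug group.
    last_issued: Dict[str, date] = {}
    for p in prescriptions:
        if p.drug_group != drug_group:
            continue
        previous = last_issued.get(p.patient_id)
        if previous is None or p.issued > previous:
            last_issued[p.patient_id] = p.issued

    # Keep notes that fall inside the window following that date.
    window = timedelta(days=window_days)
    linked: Dict[str, List[Note]] = {}
    for n in notes:
        start = last_issued.get(n.patient_id)
        if start is not None and start <= n.written <= start + window:
            linked.setdefault(n.patient_id, []).append(n)
    return linked
```

Only the linked notes would then be passed to the extraction step, which keeps the free-text evidence tied to a specific structured record.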

Context and significance

Applying LLMs to extract clinically relevant events from free text addresses a long-standing barrier in pharmacoepidemiology: the rationale for a discontinuation is frequently recorded only in narrative notes. Systems that successfully pair prescriptions with extracted reasons can enable higher-fidelity signal detection for side effects, inefficacy, or access barriers. A concurrent Harvard / Brigham and Women's Hospital preprint (arXiv 2506.11137) covers the same problem on English EHR datasets and demonstrates that LLM-based medication status extraction scales without human annotation, which reinforces the broader applicability of this approach.

What to watch

Observers should watch for the peer-reviewed final JMIR publication for full performance metrics and error analysis, replication on other languages or EHR systems, and whether the authors publish the taxonomy, annotation guidelines, or evaluation code to enable reproducibility. External replication and transparent error breakdowns (false positives versus false negatives, initiator misclassification) will determine practical utility for downstream clinical research.
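To make the error breakdown and the sample-size concern concrete, the sketch below counts false positives and false negatives from paired gold and predicted labels, and computes a 95% Wilson score interval for a proportion. The function names are hypothetical, and any numbers fed into it are illustrative rather than results from the preprint.

```python
import math
from typing import Dict, Sequence, Tuple


def error_breakdown(gold: Sequence[bool], pred: Sequence[bool]) -> Dict[str, float]:
    """Count outcome types for a binary 'discontinuation present' label."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)
    tn = sum(1 for g, p in zip(gold, pred) if not g and not p)
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall}


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 for about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half
```

As a purely illustrative case, `wilson_interval(90, 100)` gives an interval of roughly 0.83 to 0.94, which shows how wide the uncertainty remains when accuracy is estimated from 100 cases.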

Scoring rationale

A solid niche preprint demonstrating LLM application to pharmacoepidemiology in a low-resource (Estonian) language, using population-scale prescription and free-text EHR data. Relevant to clinical NLP and pharmacoepidemiology practitioners but limited by single-country scope, small evaluation set (100 cases per drug group), and preprint status pending peer review.

Read original on letsdatascience.com → letsdatascience.com/news/llms-extract-drug-disco…

