Informative Missingness to Generate Irregular Clinical Time Series

Researchers developed a diffusion-based model that generates irregular clinical time series by jointly modeling laboratory values and their observation patterns, using the MIMIC-III-derived DACMI benchmark. According to the authors, the approach captures clinically meaningful dependencies between patient physiology and clinicians' testing behavior under MNAR-like (missing-not-at-random) missingness, and they describe the results as a preliminary step toward clinical foundation models.

Published Jun 17, 2026

arXiv:2606.17106v1

Abstract: Laboratory tests in electronic health records are collected irregularly, and the absence of a test order can be as informative as the measurement itself. Such missingness reflects clinicians' decisions and patient physiology, making it important to model it directly rather than treat it as a preprocessing artifact. Here we present a diffusion-based approach for generating clinical time series that jointly models laboratory values and their observation patterns using the public Data Analytics Challenge on Missing Data Imputation (DACMI) benchmark derived from MIMIC-III. To preserve realistic sampling, we align chart times into 4-hour intervals and segment admissions into 7-day windows, producing trajectories that pair each lab value with a corresponding observation indicator. Standard transformations and normalization are applied to stabilize training. Our method extends the TimeDiff framework to learn continuous lab values and discrete missingness patterns through complementary diffusion objectives. Experiments show that the generated data closely match real patient trajectories across individual lab distributions and joint value-missingness embeddings, demonstrating that diffusion models can capture clinically meaningful dependencies between patient physiology and clinicians' testing behavior under MNAR-like (missing-not-at-random) missingness. These preliminary results indicate that our model can serve as an initial component toward developing clinical foundation models. By producing synthetic priors that preserve key physiology-missingness relationships, this work motivates the subsequent training of Prior-Data Fitted Networks capable of leveraging informative missingness, which we will investigate in the extended work.
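The preprocessing the abstract describes (4-hour bins, 7-day windows, and an observation indicator paired with each lab value) can be illustrated with a minimal sketch. This is not the authors' code: the function name `grid_admission`, the input tuple layout, the choice to keep the last measurement when several fall in one bin, and the 0.0 placeholder for unobserved cells are all assumptions made for illustration.

```python
BIN_HOURS = 4                                     # bin width stated in the abstract
WINDOW_DAYS = 7                                   # window length stated in the abstract
BINS_PER_WINDOW = WINDOW_DAYS * 24 // BIN_HOURS   # 42 bins per window


def grid_admission(events, lab_names,
                   bin_hours=BIN_HOURS, bins_per_window=BINS_PER_WINDOW):
    """Turn one admission's irregular lab events into fixed-size windows.

    events: iterable of (hours_since_admission, lab_name, value) tuples
            (hypothetical input layout, not the DACMI file format).
    Returns a list of windows, each a dict with two
    bins_per_window x len(lab_names) grids: "values" and "mask".
    """
    col = {name: j for j, name in enumerate(lab_names)}
    # Drop events before admission or for labs we are not tracking.
    events = [e for e in events if e[0] >= 0 and e[1] in col]
    if not events:
        return []

    last_bin = max(int(t // bin_hours) for t, _, _ in events)
    n_windows = last_bin // bins_per_window + 1
    windows = [
        {
            # Assumption: unobserved cells hold 0.0; the mask says to ignore them.
            "values": [[0.0] * len(lab_names) for _ in range(bins_per_window)],
            "mask": [[0] * len(lab_names) for _ in range(bins_per_window)],
        }
        for _ in range(n_windows)
    ]

    # Process in time order so the latest measurement in a bin wins (assumption).
    for t, name, value in sorted(events, key=lambda e: e[0]):
        w, row = divmod(int(t // bin_hours), bins_per_window)
        windows[w]["values"][row][col[name]] = float(value)
        windows[w]["mask"][row][col[name]] = 1
    return windows


# Toy admission with made-up numbers, purely for illustration.
labs = ["sodium", "glucose"]
toy_events = [
    (0.5, "sodium", 140.0),
    (3.9, "sodium", 138.0),    # same 4-hour bin as the first sodium draw
    (5.0, "glucose", 110.0),
    (170.0, "sodium", 141.0),  # hour 170 falls in the second 7-day window
]
windows = grid_admission(toy_events, labs)
```

Keeping the mask as a separate grid, rather than encoding missingness as a sentinel value, is what lets a generative model treat the observation pattern as its own discrete variable alongside the continuous lab values.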
