{"slug": "researchers-detect-hidden-self-harm-histories-with-ml", "title": "Researchers detect hidden self-harm histories with ML", "summary": "Researchers at the University of New Mexico School of Medicine analyzed electronic health records of over 1.3 million Veterans Health Administration patients and found that standard diagnosis codes captured self-harm history in only 1.85% of patients. A machine learning method combined with expert chart review estimated documented self-harm in 7.9% of patients, more than four times higher, according to a study published in the Journal of Medical Internet Research. The findings indicate that relying solely on diagnosis codes may substantially underestimate the need for mental health services among veterans.", "body_md": "# Researchers detect hidden self-harm histories with ML\n\nThe University of New Mexico School of Medicine led a study analyzing electronic health records for more than **1.3 million** patients served by the Veterans Health Administration, according to a UNM press release. The researchers reported that diagnosis codes identified a self-harm history in **1.85%** of patients, while their machine learning method estimated documented self-harm in **7.9%** of patients, more than four times as many, per the study published in the Journal of Medical Internet Research (reported by News-Medical and UNM). The team combined a novel machine learning approach with expert chart review and statistical calibration, the UNM release states. The study also found that among veterans with a diagnosis code for self-harm, only **22.6%** had self-harm or a history of self-harm listed on the VHA problem list. 
\"For research and planning, if we only count what is easy to see in diagnosis codes, we may substantially underestimate the need for mental health services,\" said Christophe Lambert, PhD, per the UNM release.\n\n### What happened\n\nThe University of New Mexico School of Medicine led a study that analyzed electronic health records for more than **1.3 million** patients seen in the Veterans Health Administration, according to a UNM press release and reporting in News-Medical. The researchers reported that standard diagnosis codes identified self-harm history in **1.85%** of patients, while their calibrated machine learning method estimated documented self-harm in **7.9%** of patients, a gap of more than fourfold, per the study published in the Journal of Medical Internet Research. The study also reported that among veterans with a diagnosis code for self-harm, **22.6%** had self-harm or a history of self-harm listed on the VHA problem list, the UNM release states.\n\n### Technical details\n\nEditorial analysis - technical context: The authors applied a previously developed machine learning approach and then adjusted results with expert chart review and statistical calibration, reporting a higher detected prevalence than diagnosis codes alone. Public reporting does not provide model architecture, feature sets, or performance metrics beyond the calibrated prevalence estimates; the UNM release and News-Medical article do not disclose whether natural language processing, structured-data features, or hybrid methods were used. For practitioners, combining automated detection with human review and calibration is a common pattern in EHR phenotyping to control for label noise and documentation bias.\n\n### Context and significance\n\nThe paper highlights a recurring measurement problem in clinical informatics: diagnostic coding and problem lists underrepresent clinically documented conditions. 
Accurate ascertainment of past self-harm matters because prior self-harm is a strong predictor of future suicide risk and influences resource planning, quality measurement, and risk stratification. Studies that rely solely on billing codes will likely undercount prevalence, which can bias research estimates and operational planning, a point the authors and the UNM release emphasize.\n\n### What to watch\n\nObservers should look for:\n\n- peer-reviewed details in the Journal of Medical Internet Research article about model inputs and performance metrics\n- replication or external validation in non-VHA systems to assess portability across different EHR vendors and documentation cultures\n- how health systems balance automated detection with privacy, governance, and clinical workflow integration\n\nUNM has not provided full technical reproducibility materials in the press coverage cited; readers should consult the published article for code or data availability statements.\n\n### Bottom line\n\nThe reported results quantify a substantial visibility gap between documentation found in diagnosis codes and what a calibrated ML approach detects in chart records, with implications for researchers and health systems that use EHR data for surveillance, risk prediction, and planning.\n\n## Scoring Rationale\n\nThis is a notable, practice-relevant application demonstrating measurement gaps in EHR-derived phenotypes and a pragmatic ML-plus-chart-review methodology. 
It is important for clinical informatics and risk-prediction work but is not a frontier-model breakthrough.", "url": "https://wpnews.pro/news/researchers-detect-hidden-self-harm-histories-with-ml", "canonical_source": "https://letsdatascience.com/news/researchers-detect-hidden-self-harm-histories-with-ml-04eced15", "published_at": "2026-06-06 04:50:22.671853+00:00", "updated_at": "2026-06-06 04:50:25.408861+00:00", "lang": "en", "topics": ["machine-learning", "artificial-intelligence", "ai-research", "ai-ethics"], "entities": ["University of New Mexico School of Medicine", "Veterans Health Administration", "Christophe Lambert", "Journal of Medical Internet Research"], "alternates": {"html": "https://wpnews.pro/news/researchers-detect-hidden-self-harm-histories-with-ml", "markdown": "https://wpnews.pro/news/researchers-detect-hidden-self-harm-histories-with-ml.md", "text": "https://wpnews.pro/news/researchers-detect-hidden-self-harm-histories-with-ml.txt", "jsonld": "https://wpnews.pro/news/researchers-detect-hidden-self-harm-histories-with-ml.jsonld"}}