An arXiv preprint (arXiv:2606.03995) presents an explainable machine learning approach that distinguishes normal cognition (NC), mild cognitive impairment (MCI), and Alzheimer's disease (AD) using routine clinical features from the Alzheimer's Disease Neuroimaging Initiative (ADNI). The authors trained an XGBoost classifier on 1,641 baseline subjects using eight features: MMSE, CDR Global, CDR-SB, MoCA, FAQ, age, sex, and education, per the preprint. The model used Optuna for hyperparameter tuning, SMOTE for class imbalance, and SHAP for feature-level explainability. In five-fold cross-validation the preprint reports mean macro AUC 0.983 and accuracy 0.944; on a held-out test set (n = 247) it reports macro AUC 0.982 (95% CI: 0.965-0.995) and Cohen's kappa 0.909. The preprint states that future work will add speech biomarkers for multimodal detection.
What happened
The arXiv preprint arXiv:2606.03995 (submitted 14 Apr 2026) reports an explainable machine learning pipeline that performs three-class classification of cognitive status using Alzheimer's Disease Neuroimaging Initiative (ADNI) baseline data. The paper reports 1,641 baseline subjects (608 NC, 767 MCI, 266 AD) and evaluates an XGBoost model with SHAP explanations, per the preprint.
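The reported class counts imply a moderately imbalanced problem, which is the setting SMOTE is meant to address. The arithmetic can be sketched as follows, using only the counts given in the preprint; the variable names are illustrative and not taken from the authors' code.

```python
# Class counts as reported in the preprint (ADNI baseline subjects).
counts = {"NC": 608, "MCI": 767, "AD": 266}

total = sum(counts.values())  # should match the reported 1,641 subjects
shares = {label: n / total for label, n in counts.items()}

# Accuracy of a trivial classifier that always predicts the largest class.
majority_label = max(counts, key=counts.get)
majority_baseline = counts[majority_label] / total

# Ratio of the largest class to the smallest, a rough imbalance measure.
imbalance_ratio = max(counts.values()) / min(counts.values())

for label, share in shares.items():
    print(f"{label}: {share:.1%}")
print(f"majority baseline ({majority_label}): {majority_baseline:.3f}")
print(f"imbalance ratio: {imbalance_ratio:.2f}")
```

AD is the smallest class at roughly 16% of subjects, and always predicting MCI would score about 0.47 accuracy, which gives a floor against which the reported accuracy can be read.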
Technical details
According to the preprint, the model used eight clinical features: MMSE, CDR Global, CDR Sum of Boxes (CDR-SB), MoCA, FAQ, age, sex, and education. Hyperparameters were optimised with Optuna (50 trials) and SMOTE addressed class imbalance, the preprint states. In five-fold cross-validation the authors report mean macro AUC 0.983 (SD 0.007) and accuracy 0.944 (SD 0.006). On the held-out test set they report macro AUC 0.982 (95% CI: 0.965-0.995), accuracy 0.943, macro F1 0.927, balanced accuracy 0.932, and Cohen's kappa 0.909. The authors used SHAP values to assess feature importance, finding CDR Global dominant for separating NC from MCI and a combination of CDR-SB and MMSE driving AD classification, per the preprint.
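For readers less familiar with the test-set metrics listed above, the sketch below shows how accuracy, balanced accuracy, macro F1, and Cohen's kappa follow from a three-class confusion matrix. The matrix is hypothetical and chosen only for illustration; it is not the preprint's confusion matrix and will not reproduce the reported values.

```python
# Hypothetical 3x3 confusion matrix (rows = true class, columns = predicted).
# These counts are made up for illustration; they are NOT from the preprint.
labels = ["NC", "MCI", "AD"]
cm = [
    [38, 2, 0],
    [3, 42, 2],
    [0, 2, 11],
]

k = len(labels)
n = sum(sum(row) for row in cm)
row_totals = [sum(cm[i]) for i in range(k)]                       # true counts
col_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # predicted counts
diag = [cm[i][i] for i in range(k)]                               # correct predictions

# Accuracy: fraction of all subjects classified correctly.
accuracy = sum(diag) / n

# Balanced accuracy: unweighted mean of per-class recall.
recalls = [diag[i] / row_totals[i] for i in range(k)]
balanced_accuracy = sum(recalls) / k

# Macro F1: unweighted mean of per-class F1, where F1 = 2*TP / (true + predicted).
f1_scores = [2 * diag[i] / (row_totals[i] + col_totals[i]) for i in range(k)]
macro_f1 = sum(f1_scores) / k

# Cohen's kappa: agreement beyond what the class marginals would give by chance.
expected_agreement = sum(row_totals[i] * col_totals[i] for i in range(k)) / n**2
kappa = (accuracy - expected_agreement) / (1 - expected_agreement)

print(f"accuracy={accuracy:.3f} balanced_accuracy={balanced_accuracy:.3f}")
print(f"macro_f1={macro_f1:.3f} kappa={kappa:.3f}")
```

Balanced accuracy and macro F1 weight each class equally, so they are more sensitive than plain accuracy to errors on the smaller AD class. Kappa discounts the agreement that the class marginals alone would produce.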
Editorial analysis
Models achieving very high performance on ADNI are common in the literature, but industry observers note such results frequently reflect dataset characteristics, preprocessing choices, and temporal or cohort biases rather than guaranteed clinical generalizability. For practitioners, the use of XGBoost plus SHAP is a pragmatic explainability pattern that aids feature-level interpretation but does not replace prospective external validation.
Context and significance
For the clinical-ML community, the key reported contribution is a compact, explainable pipeline using routine cognitive assessments rather than imaging or molecular biomarkers. Industry context: compact models that rely on standard clinical scales lower data-collection barriers, which can accelerate initial validation in resource-constrained settings, but they also amplify the need for cross-cohort testing because measurement protocols and patient mixes vary across clinics.
What to watch
The preprint states that future work will extend the framework with speech biomarkers for multimodal detection. Practitioners should look for external validation on non-ADNI cohorts, prospective performance, and robustness checks against assessment timing, rater variability, and demographic shifts.
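One simple form such a robustness check can take is comparing accuracy across cohorts or subgroups. The sketch below assumes evaluation records tagged with a cohort name; the records, the cohort labels, and the helper `accuracy_by_group` are all hypothetical and do not come from the preprint.

```python
from collections import defaultdict

# Hypothetical evaluation records: (cohort, true label, predicted label).
# Invented for illustration only; the preprint reports no external cohort.
records = [
    ("ADNI", "NC", "NC"),
    ("ADNI", "MCI", "MCI"),
    ("ADNI", "AD", "AD"),
    ("ADNI", "MCI", "MCI"),
    ("external", "NC", "NC"),
    ("external", "MCI", "NC"),
    ("external", "AD", "AD"),
    ("external", "MCI", "AD"),
]


def accuracy_by_group(rows):
    """Return a dict mapping each group name to its classification accuracy."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for group, y_true, y_pred in rows:
        totals[group] += 1
        hits[group] += int(y_true == y_pred)
    return {group: hits[group] / totals[group] for group in totals}


per_cohort = accuracy_by_group(records)

# A large gap between cohorts is a signal that performance may not transfer.
gap = max(per_cohort.values()) - min(per_cohort.values())

for cohort, acc in per_cohort.items():
    print(f"{cohort}: accuracy={acc:.2f}")
print(f"largest gap: {gap:.2f}")
```

The same grouping key could be swapped for rater, assessment window, or a demographic attribute to probe the other sources of shift named above.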
Scoring Rationale
The paper reports very strong, explainable performance on a widely used research dataset (ADNI), which is notable for modelers and clinical ML teams. Impact is limited by single-dataset evaluation and the need for external prospective validation.