arXiv

Early Detection of Alzheimer's Disease Using Explainable Machine Learning on Clinical Biomarkers: A Multi-Class Classification Study Using the Alzheimer's Disease Neuroimaging Initiative (ADNI) Dataset

Title: Leveraging Explainable Machine Learning and Clinical Biomarkers for Early Alzheimer’s Detection: A Multi-Class Analysis of the ADNI Dataset

Abstract

Background: Alzheimer’s disease (AD) currently impacts more than 55 million individuals globally. There is a pressing, unmet need for detection methods that are both accurate and interpretable to distinguish between normal cognition (NC), mild cognitive impairment (MCI), and AD based on standard clinical evaluations.

Methods: This study developed an XGBoost classifier for three-class classification using eight clinical variables from the Alzheimer's Disease Neuroimaging Initiative (ADNI): Mini-Mental State Examination (MMSE), Clinical Dementia Rating (CDR) Global, CDR Sum of Boxes (CDR-SB), Montreal Cognitive Assessment (MoCA), Functional Activities Questionnaire (FAQ), age, sex, and education level. Class imbalance was handled with the Synthetic Minority Over-sampling Technique (SMOTE), and hyperparameters were tuned with Optuna over 50 trials. Model performance was assessed using macro AUC-ROC (with 95% confidence intervals derived from 1,000 bootstrap iterations), macro F1 score, balanced accuracy, and Cohen’s kappa. SHAP values were used to provide feature-level explainability.
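
To illustrate the bootstrap confidence interval described in the Methods, the following is a minimal sketch and not the authors' code. It assumes a percentile bootstrap over test-set subjects and a one-vs-rest macro AUC, neither of which the abstract specifies. The function names (`binary_auc`, `macro_auc`, `bootstrap_ci`), the synthetic labels and probabilities, and the choice to redraw any resample that lacks one of the three classes are all hypothetical.

```python
import random


def binary_auc(labels, scores):
    """AUC from the Mann-Whitney rank-sum statistic; tied scores share an average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        return None  # AUC is undefined when one side is empty
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def macro_auc(y_true, proba, n_classes=3):
    """Unweighted mean of one-vs-rest AUCs (assumed averaging scheme)."""
    aucs = []
    for c in range(n_classes):
        auc = binary_auc([1 if y == c else 0 for y in y_true],
                         [p[c] for p in proba])
        if auc is None:
            return None
        aucs.append(auc)
    return sum(aucs) / n_classes


def bootstrap_ci(y_true, proba, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the macro AUC."""
    if macro_auc(y_true, proba) is None:
        raise ValueError("every class must appear in y_true")
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        value = macro_auc([y_true[i] for i in idx], [proba[i] for i in idx])
        if value is not None:  # assumption: redraw resamples missing a class
            stats.append(value)
    stats.sort()
    lo_idx = round(n_boot * alpha / 2)
    return stats[lo_idx], stats[n_boot - lo_idx - 1]


# Synthetic stand-in data (0 = NC, 1 = MCI, 2 = AD); not ADNI values.
rng = random.Random(42)
y_true = [rng.randrange(3) for _ in range(200)]
proba = []
for y in y_true:
    raw = [rng.random() + (1.0 if c == y else 0.0) for c in range(3)]
    total = sum(raw)
    proba.append([r / total for r in raw])

point = macro_auc(y_true, proba)
low, high = bootstrap_ci(y_true, proba)
print(f"macro AUC = {point:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

Resampling whole subjects keeps each subject's label and predicted probabilities together, so the interval reflects sampling variability in the test set rather than variability from retraining the model.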

Results: The analysis included 1,641 subjects at baseline, distributed as 608 with NC, 767 with MCI, and 266 with AD. In five-fold cross-validation, the model achieved a mean macro AUC of 0.983 (SD 0.007), an accuracy of 0.944 (SD 0.006), and a macro F1 of 0.929 (SD 0.008). Performance on the independent test set (n = 247) yielded a macro AUC of 0.982 (95% CI: 0.965–0.995), accuracy of 0.943, balanced accuracy of 0.932, macro F1 of 0.927, and Cohen’s kappa of 0.909. SHAP interpretation highlighted that CDR Global was the primary predictor for distinguishing NC and MCI, whereas CDR-SB and MMSE were the key drivers for classifying AD.
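
For readers less familiar with the reported metrics, the sketch below shows how accuracy, balanced accuracy, macro F1, and Cohen's kappa can each be derived from a single three-class confusion matrix. It is an illustrative example only: the matrix `toy_cm` is invented, the helper name `summarize` is hypothetical, and none of the numbers come from the study.

```python
def summarize(cm):
    """Compute summary metrics from a square confusion matrix.

    Rows are true classes and columns are predicted classes.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(row) for row in cm]                              # true counts
    col_sums = [sum(cm[r][c] for r in range(k)) for c in range(k)]  # predicted counts
    diag = [cm[i][i] for i in range(k)]                              # correct predictions

    accuracy = sum(diag) / total
    # Balanced accuracy: mean per-class recall, so small classes count equally.
    balanced_accuracy = sum(diag[i] / row_sums[i] for i in range(k)) / k
    # Per-class F1 = 2*TP / (2*TP + FP + FN) = 2*TP / (row_sum + col_sum).
    macro_f1 = sum(2 * diag[i] / (row_sums[i] + col_sums[i]) for i in range(k)) / k
    # Cohen's kappa: agreement beyond what the marginals would give by chance.
    expected = sum(row_sums[i] * col_sums[i] for i in range(k)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected)

    return {
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "macro_f1": macro_f1,
        "kappa": kappa,
    }


# Invented 3x3 matrix (order: NC, MCI, AD), with the last class deliberately small.
toy_cm = [
    [9, 1, 0],
    [1, 8, 1],
    [0, 2, 3],
]
metrics = summarize(toy_cm)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this toy matrix the smallest class has the lowest recall, so balanced accuracy falls below plain accuracy. That gap is the reason both figures are worth reporting when class sizes differ, as they do across NC, MCI, and AD here.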

Conclusion: An explainable machine learning model trained on routine clinical metrics achieves high three-class discrimination among NC, MCI, and AD (test-set macro AUC 0.982, Cohen’s kappa 0.909). The SHAP analysis shows clinically plausible, class-specific feature importance, which supports the model's validity. Future work will integrate speech biomarkers into this framework to enable multimodal detection.


Source: arXiv Generated at: 2026-06-04 00:00:00 UTC
