Explainable machine learning aggregates polygenic risk scores and electronic health records for Alzheimer's disease prediction.

Title: Explainable machine learning aggregates polygenic risk scores and electronic health records for Alzheimer's disease prediction.
Publication Type: Journal Article
Year of Publication: 2023
Authors: Gao XRaymond, Chiariglione M, Qin K, Nuytemans K, Scharre DW, Li Y-J, Martin ER
Journal: Sci Rep
Volume: 13
Issue: 1
Pagination: 450
Date Published: 01/2023
ISSN: 2045-2322
Keywords: Adult, Aged, Alzheimer Disease, Electronic Health Records, Humans, Machine Learning, Risk Factors
Abstract

Alzheimer's disease (AD) is the most common late-onset neurodegenerative disorder. Identifying individuals at increased risk of developing AD is important for early intervention. Using data from the Alzheimer Disease Genetics Consortium, we constructed polygenic risk scores (PRSs) for AD and age-at-onset (AAO) of AD for the UK Biobank participants. We then built machine learning (ML) models for predicting development of AD, and explored feature importance among PRSs, conventional risk factors, and ICD-10 codes from electronic health records, a total of > 11,000 features using the UK Biobank dataset. We used eXtreme Gradient Boosting (XGBoost) and SHapley Additive exPlanations (SHAP), which provided superior ML performance as well as aided ML model explanation. For participants age 40 and older, the area under the curve for AD was 0.88. For subjects of age 65 and older (late-onset AD), PRSs were the most important predictors. This is the first observation that PRSs constructed from the AD risk and AAO play more important roles than age in predicting AD. The ML model also identified important predictors from EHR, including urinary tract infection, syncope and collapse, chest pain, disorientation and hypercholesterolemia, for developing AD. Our ML model improved the accuracy of AD risk prediction by efficiently exploring numerous predictors and identified novel feature patterns.

DOI: 10.1038/s41598-023-27551-1
Alternate Journal: Sci Rep
PubMed ID: 36624143
PubMed Central ID: PMC9829871
Grant List:
MC_PC_17228 / MRC_ / Medical Research Council / United Kingdom
U01 AG032984 / AG / NIA NIH HHS / United States
RF1 AG060472 / AG / NIA NIH HHS / United States