Development of a novel dementia risk prediction model in the general population: A large, longitudinal, population-based machine-learning study

BACKGROUND: The existing dementia risk models are limited to known risk factors and traditional statistical methods. We aimed to employ machine learning (ML) to develop a novel dementia prediction model by leveraging a rich-phenotypic variable space of 366 features covering multiple domains of healt...

Bibliographic Details
Main Authors: You, Jia, Zhang, Ya-Ru, Wang, Hui-Fu, Yang, Ming, Feng, Jian-Feng, Yu, Jin-Tai, Cheng, Wei
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9519470/
https://www.ncbi.nlm.nih.gov/pubmed/36187723
http://dx.doi.org/10.1016/j.eclinm.2022.101665
collection PubMed
description BACKGROUND: Existing dementia risk models are limited to known risk factors and traditional statistical methods. We aimed to use machine learning (ML) to develop a novel dementia prediction model by leveraging a phenotypically rich variable space of 366 features covering multiple domains of health-related data.
METHODS: In this longitudinal population-based cohort from the UK Biobank (UKB), 425,159 participants without dementia were enrolled at 22 recruitment centres across the UK between March 1, 2006 and October 31, 2010. We implemented a data-driven strategy to identify predictors from 366 candidate variables covering a comprehensive range of genetic and environmental factors, and developed the ML model to predict incident dementia and Alzheimer's disease (AD) within five years, within ten years, and over longer follow-up (median 11.9 years [interquartile range 11.2–12.5]).
FINDINGS: During a follow-up of 5,023,337 person-years, 5287 participants developed dementia and 2416 developed AD. We established a novel UKB dementia risk prediction (UKB-DRP) model comprising ten predictors: age, ApoE ε4, pairs matching time, leg fat percentage, number of medications taken, reaction time, peak expiratory flow, mother's age at death, long-standing illness, and mean corpuscular volume. The model was internally evaluated for discrimination and calibration using five-fold cross-validation and was then compared with existing prediction scales. The UKB-DRP model achieved high discriminative accuracy for dementia (AUC 0.848 ± 0.007) and higher still for AD (AUC 0.862 ± 0.015). The model was well calibrated (Hosmer-Lemeshow goodness-of-fit p-value = 0.92), and its predictive power held across the different incidence time groups. More importantly, the model clearly outperformed existing models such as the Cardiovascular Risk Factors, Aging, and Incidence of Dementia Risk Score (AUC 0.705 ± 0.008), the Dementia Risk Score (AUC 0.752 ± 0.007), and the Australian National University Alzheimer's Disease Risk Index (AUC 0.584 ± 0.017). The model was validated only internally, in a general population of European ancestry and White ethnicity, so further validation with independent datasets is necessary to confirm these findings.
INTERPRETATION: Our ML-based UKB-DRP model incorporates ten easily accessible predictors and shows solid predictive power for incident dementia and AD within five years, within ten years, and over longer follow-up. It can be used to identify individuals at high risk of dementia and AD in the general population.
FUNDING: This study was funded by grants from the Science and Technology Innovation 2030 Major Projects (2022ZD0211600), National Key R&D Program of China (2018YFC1312904, 2019YFA070950), National Natural Science Foundation of China (282071201, 81971032, 82071997), Shanghai Municipal Science and Technology Major Project (2018SHZDZX01), Research Start-up Fund of Huashan Hospital (2022QD002), Excellence 2025 Talent Cultivation Program at Fudan University (3030277001), Shanghai Rising-Star Program (21QA1408700), Medical Engineering Fund of Fudan University (yg2021-013), and the 111 Project (No. B18015).
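The abstract reports discrimination as "AUC mean ± spread" from five-fold cross-validation. The sketch below shows one way such a summary could be computed, and it is not the authors' pipeline. It assumes the ± values are standard deviations across folds, which the abstract does not state, and it scores precomputed risk values on each held-out fold rather than refitting a model. The names `auc`, `fold_indices`, and `summarize_auc` are hypothetical, and the data are synthetic.

```python
import random
import statistics


def auc(labels, scores):
    """Rank-based AUC: chance that a random case outscores a random non-case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def summarize_auc(labels, scores, k=5, seed=0):
    """Mean and sample standard deviation of the AUC over k held-out folds."""
    per_fold = []
    for fold in fold_indices(len(labels), k, seed):
        fold_labels = [labels[i] for i in fold]
        fold_scores = [scores[i] for i in fold]
        per_fold.append(auc(fold_labels, fold_scores))
    return statistics.mean(per_fold), statistics.stdev(per_fold)


if __name__ == "__main__":
    # Synthetic example only: cases tend to receive higher scores than non-cases.
    rng = random.Random(42)
    labels = [1 if rng.random() < 0.2 else 0 for _ in range(2000)]
    scores = [rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]
    mean_auc, sd_auc = summarize_auc(labels, scores)
    print(f"AUC {mean_auc:.3f} ± {sd_auc:.3f}")
```

In a real analysis the scores for each fold would come from a model trained on the other four folds. The pairwise loop is quadratic in the fold size, so a library routine would be the practical choice at UK Biobank scale.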
format Online
Article
Text
id pubmed-9519470
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-9519470 2022-09-30 eClinicalMedicine Articles Elsevier 2022-09-23 /pmc/articles/PMC9519470/ /pubmed/36187723 http://dx.doi.org/10.1016/j.eclinm.2022.101665 Text en © 2022 The Author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
topic Articles