
Machine learning algorithms for identifying predictive variables of mortality risk following dementia diagnosis: a longitudinal cohort study

Machine learning (ML) could have advantages over traditional statistical models in identifying risk factors. Using ML algorithms, our objective was to identify the most important variables associated with mortality after dementia diagnosis in the Swedish Registry for Cognitive/Dementia Disorders (Sv...


Bibliographic Details
Main authors: Mostafaei, Shayan; Hoang, Minh Tuan; Jurado, Pol Grau; Xu, Hong; Zacarias-Pons, Lluis; Eriksdotter, Maria; Chatterjee, Saikat; Garcia-Ptacek, Sara
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10257644/
https://www.ncbi.nlm.nih.gov/pubmed/37301891
http://dx.doi.org/10.1038/s41598-023-36362-3
author Mostafaei, Shayan
Hoang, Minh Tuan
Jurado, Pol Grau
Xu, Hong
Zacarias-Pons, Lluis
Eriksdotter, Maria
Chatterjee, Saikat
Garcia-Ptacek, Sara
collection PubMed
description Machine learning (ML) could have advantages over traditional statistical models in identifying risk factors. Using ML algorithms, our objective was to identify the most important variables associated with mortality after dementia diagnosis in the Swedish Registry for Cognitive/Dementia Disorders (SveDem). From SveDem, a longitudinal cohort of 28,023 dementia-diagnosed patients was selected for this study. Sixty variables were considered as potential predictors of mortality risk, such as age at dementia diagnosis, dementia type, sex, body mass index (BMI), mini-mental state examination (MMSE) score, time from referral to initiation of work-up, time from initiation of work-up to diagnosis, dementia medications, comorbidities, and some specific medications for chronic comorbidities (e.g., cardiovascular disease). We applied sparsity-inducing penalties to three ML algorithms and identified twenty important variables for the binary classification task in mortality risk prediction and fifteen variables to predict time to death. The area under the ROC curve (AUC) was used to evaluate the classification algorithms. Then, an unsupervised clustering algorithm was applied to the set of twenty selected variables to find two main clusters, which accurately matched the surviving and dead patient groups. A support vector machine with an appropriate sparsity penalty classified mortality risk with accuracy = 0.7077, AUROC = 0.7375, sensitivity = 0.6436, and specificity = 0.740. Across the three ML algorithms, the majority of the twenty identified variables were compatible with the literature and with our previous studies on SveDem. We also found new variables which were not previously reported in the literature as associated with mortality in dementia. Performance of basic dementia diagnostic work-up, time from referral to initiation of work-up, and time from initiation of work-up to diagnosis were elements of the diagnostic process identified by the ML algorithms. The median follow-up time was 1053 (IQR = 516–1771) days in surviving and 1125 (IQR = 605–1770) days in dead patients. For prediction of time to death, the CoxBoost model identified 15 variables and ranked them in order of importance. The most important variables were age at diagnosis, MMSE score, sex, BMI, and Charlson Comorbidity Index, with selection scores of 23%, 15%, 14%, 12% and 10%, respectively. This study demonstrates the potential of sparsity-inducing ML algorithms in improving our understanding of mortality risk factors in dementia patients and their application in clinical settings. Moreover, ML methods can be used as a complement to traditional statistical methods.
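The abstract reports accuracy, sensitivity, specificity, and AUROC for the binary mortality classifier. As a minimal sketch of how such measures are commonly computed, the Python below uses a small made-up set of labels and scores; the function names, the toy data, and the 0.5 decision threshold are all assumptions for illustration and are not taken from the study.

```python
from __future__ import annotations

from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, where 1 = died and 0 = survived."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_summary(y_true: Sequence[int], y_pred: Sequence[int]) -> dict[str, float]:
    """Accuracy, sensitivity (true positive rate), and specificity (true negative rate)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the positive
    case receives the higher score; tied pairs count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: eight patients with true outcomes and model risk scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

summary = classification_summary(y_true, y_pred)
roc_auc = auc(y_true, scores)
print(summary, roc_auc)
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which may be why both kinds of measure appear together in the abstract.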
format Online
Article
Text
id pubmed-10257644
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10257644 2023-06-12 Machine learning algorithms for identifying predictive variables of mortality risk following dementia diagnosis: a longitudinal cohort study. Mostafaei, Shayan; Hoang, Minh Tuan; Jurado, Pol Grau; Xu, Hong; Zacarias-Pons, Lluis; Eriksdotter, Maria; Chatterjee, Saikat; Garcia-Ptacek, Sara. Sci Rep Article. Nature Publishing Group UK 2023-06-10 /pmc/articles/PMC10257644/ /pubmed/37301891 http://dx.doi.org/10.1038/s41598-023-36362-3 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Machine learning algorithms for identifying predictive variables of mortality risk following dementia diagnosis: a longitudinal cohort study
topic Article