Stroke risk prediction using machine learning: a prospective cohort study of 0.5 million Chinese adults
OBJECTIVE: To compare Cox models, machine learning (ML), and ensemble models combining both approaches, for prediction of stroke risk in a prospective study of Chinese adults. MATERIALS AND METHODS: We evaluated models for stroke risk at varying intervals of follow-up (<9 years, 0–3 years, 3–6 ye...
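Main authors: | Chun, Matthew; Clarke, Robert; Cairns, Benjamin J; Clifton, David; Bennett, Derrick; Chen, Yiping; Guo, Yu; Pei, Pei; Lv, Jun; Yu, Canqing; Yang, Ling; Li, Liming; Chen, Zhengming; Zhu, Tingting |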
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2021 |
Subjects: | Research and Applications |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8324240/ https://www.ncbi.nlm.nih.gov/pubmed/33969418 http://dx.doi.org/10.1093/jamia/ocab068 |
_version_ | 1783731366188285952 |
---|---|
author | Chun, Matthew Clarke, Robert Cairns, Benjamin J Clifton, David Bennett, Derrick Chen, Yiping Guo, Yu Pei, Pei Lv, Jun Yu, Canqing Yang, Ling Li, Liming Chen, Zhengming Zhu, Tingting |
author_sort | Chun, Matthew |
collection | PubMed |
description | OBJECTIVE: To compare Cox models, machine learning (ML), and ensemble models combining both approaches, for prediction of stroke risk in a prospective study of Chinese adults. MATERIALS AND METHODS: We evaluated models for stroke risk at varying intervals of follow-up (<9 years, 0–3 years, 3–6 years, 6–9 years) in 503 842 adults without prior history of stroke recruited from 10 areas in China in 2004–2008. Inputs included sociodemographic factors, diet, medical history, physical activity, and physical measurements. We compared discrimination and calibration of Cox regression, logistic regression, support vector machines, random survival forests, gradient boosted trees (GBT), and multilayer perceptrons, benchmarking performance against the 2017 Framingham Stroke Risk Profile. We then developed an ensemble approach to identify individuals at high risk of stroke (>10% predicted 9-yr stroke risk) by selectively applying either a GBT or Cox model based on individual-level characteristics. RESULTS: For 9-yr stroke risk prediction, GBT provided the best discrimination (AUROC: 0.833 in men, 0.836 in women) and calibration, with consistent results in each interval of follow-up. The ensemble approach yielded incrementally higher accuracy (men: 76%, women: 80%), specificity (men: 76%, women: 81%), and positive predictive value (men: 26%, women: 24%) compared to any of the single-model approaches. DISCUSSION AND CONCLUSION: Among several approaches, an ensemble model combining both GBT and Cox models achieved the best performance for identifying individuals at high risk of stroke in a contemporary study of Chinese adults. The results highlight the potential value of expanding the use of ML in clinical practice. |
format | Online Article Text |
id | pubmed-8324240 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-8324240 2021-08-02 Stroke risk prediction using machine learning: a prospective cohort study of 0.5 million Chinese adults Chun, Matthew Clarke, Robert Cairns, Benjamin J Clifton, David Bennett, Derrick Chen, Yiping Guo, Yu Pei, Pei Lv, Jun Yu, Canqing Yang, Ling Li, Liming Chen, Zhengming Zhu, Tingting J Am Med Inform Assoc Research and Applications OBJECTIVE: To compare Cox models, machine learning (ML), and ensemble models combining both approaches, for prediction of stroke risk in a prospective study of Chinese adults. MATERIALS AND METHODS: We evaluated models for stroke risk at varying intervals of follow-up (<9 years, 0–3 years, 3–6 years, 6–9 years) in 503 842 adults without prior history of stroke recruited from 10 areas in China in 2004–2008. Inputs included sociodemographic factors, diet, medical history, physical activity, and physical measurements. We compared discrimination and calibration of Cox regression, logistic regression, support vector machines, random survival forests, gradient boosted trees (GBT), and multilayer perceptrons, benchmarking performance against the 2017 Framingham Stroke Risk Profile. We then developed an ensemble approach to identify individuals at high risk of stroke (>10% predicted 9-yr stroke risk) by selectively applying either a GBT or Cox model based on individual-level characteristics. RESULTS: For 9-yr stroke risk prediction, GBT provided the best discrimination (AUROC: 0.833 in men, 0.836 in women) and calibration, with consistent results in each interval of follow-up. The ensemble approach yielded incrementally higher accuracy (men: 76%, women: 80%), specificity (men: 76%, women: 81%), and positive predictive value (men: 26%, women: 24%) compared to any of the single-model approaches. DISCUSSION AND CONCLUSION: Among several approaches, an ensemble model combining both GBT and Cox models achieved the best performance for identifying individuals at high risk of stroke in a contemporary study of Chinese adults. The results highlight the potential value of expanding the use of ML in clinical practice. Oxford University Press 2021-05-09 /pmc/articles/PMC8324240/ /pubmed/33969418 http://dx.doi.org/10.1093/jamia/ocab068 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of the American Medical Informatics Association. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research and Applications Chun, Matthew Clarke, Robert Cairns, Benjamin J Clifton, David Bennett, Derrick Chen, Yiping Guo, Yu Pei, Pei Lv, Jun Yu, Canqing Yang, Ling Li, Liming Chen, Zhengming Zhu, Tingting Stroke risk prediction using machine learning: a prospective cohort study of 0.5 million Chinese adults |
title | Stroke risk prediction using machine learning: a prospective cohort study of 0.5 million Chinese adults |
title_sort | stroke risk prediction using machine learning: a prospective cohort study of 0.5 million chinese adults |
topic | Research and Applications |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8324240/ https://www.ncbi.nlm.nih.gov/pubmed/33969418 http://dx.doi.org/10.1093/jamia/ocab068 |
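
The abstract describes flagging individuals as high risk when predicted 9-yr stroke risk exceeds 10%, and an ensemble that applies either a GBT or a Cox prediction per individual. The following is a minimal sketch of that idea, not the authors' implementation: the selection flags (`use_gbt`), the function names, and the toy numbers are all hypothetical, since the record does not state how the ensemble chooses between models. It also shows how accuracy, specificity, and positive predictive value follow from the resulting confusion counts.

```python
from typing import Dict, List, Sequence

# ">10% predicted 9-yr stroke risk" is the high-risk cutoff named in the abstract.
HIGH_RISK_THRESHOLD = 0.10


def ensemble_risk(gbt_risk: Sequence[float],
                  cox_risk: Sequence[float],
                  use_gbt: Sequence[bool]) -> List[float]:
    """Pick, per individual, either the GBT or the Cox predicted risk.

    `use_gbt` is a hypothetical stand-in for the individual-level
    selection rule, which the abstract does not specify.
    """
    return [g if flag else c for g, c, flag in zip(gbt_risk, cox_risk, use_gbt)]


def high_risk_metrics(risk: Sequence[float],
                      had_stroke: Sequence[bool],
                      threshold: float = HIGH_RISK_THRESHOLD) -> Dict[str, float]:
    """Accuracy, specificity, and PPV of the 'risk > threshold' rule."""
    tp = fp = tn = fn = 0
    for r, y in zip(risk, had_stroke):
        flagged = r > threshold
        if flagged and y:
            tp += 1
        elif flagged and not y:
            fp += 1
        elif not flagged and y:
            fn += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        "ppv": tp / (tp + fp) if (tp + fp) else float("nan"),
    }


# Toy, made-up data for five individuals (not from the study).
gbt = [0.02, 0.15, 0.30, 0.08, 0.12]
cox = [0.05, 0.09, 0.25, 0.11, 0.04]
use_gbt = [True, False, True, False, True]      # hypothetical selection flags
had_stroke = [False, False, True, False, True]  # hypothetical observed outcomes

combined = ensemble_risk(gbt, cox, use_gbt)
metrics = high_risk_metrics(combined, had_stroke)
print(combined)
print(metrics)
```

With a low event rate, a rule like this can reach high specificity while positive predictive value stays modest, which is the pattern the reported figures show (specificity 76–81%, PPV 24–26%).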