Generalizable prediction of COVID-19 mortality on worldwide patient data
OBJECTIVE: Predicting Coronavirus disease 2019 (COVID-19) mortality for patients is critical for early-stage care and intervention. Existing studies mainly built models on datasets with limited geographical range or size. In this study, we developed COVID-19 mortality prediction models on worldwide,...
Main authors: | Edelson, Maxim; Kuo, Tsung-Ting |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2022 |
Subjects: | Research and Applications |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9129227/ https://www.ncbi.nlm.nih.gov/pubmed/35663116 http://dx.doi.org/10.1093/jamiaopen/ooac036 |
_version_ | 1784712702662279168 |
---|---|
author | Edelson, Maxim; Kuo, Tsung-Ting |
author_facet | Edelson, Maxim; Kuo, Tsung-Ting |
author_sort | Edelson, Maxim |
collection | PubMed |
description | OBJECTIVE: Predicting Coronavirus disease 2019 (COVID-19) mortality for patients is critical for early-stage care and intervention. Existing studies mainly built models on datasets with limited geographical range or size. In this study, we developed COVID-19 mortality prediction models on worldwide, large-scale “sparse” data and on a “dense” subset of the data. MATERIALS AND METHODS: We evaluated 6 classifiers, including logistic regression (LR), support vector machine (SVM), random forest (RF), multilayer perceptron (MLP), AdaBoost (AB), and Naive Bayes (NB). We also conducted temporal analysis and calibrated our models using Isotonic Regression. RESULTS: The results showed that AB outperformed the other classifiers for the sparse dataset, while LR provided the highest-performing results for the dense dataset (with area under the receiver operating characteristic curve, or AUC ≈ 0.7 for the sparse dataset and AUC = 0.963 for the dense one). We also identified impactful features such as symptoms, countries, age, and the date of death/discharge. All our models are well-calibrated (P > .1). DISCUSSION: Our results highlight the tradeoff of using sparse training data to increase generalizability versus training on denser data, which produces higher discrimination results. We found that covariates such as patient information on symptoms, countries (where the case was reported), age, and the date of discharge from the hospital or death were the most important for mortality prediction. CONCLUSION: This study is a stepping-stone towards improving healthcare quality during the COVID-19 era and potentially other pandemics. Our code is publicly available at: https://doi.org/10.5281/zenodo.6336231. |
format | Online Article Text |
id | pubmed-9129227 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9129227 2022-05-25 Generalizable prediction of COVID-19 mortality on worldwide patient data Edelson, Maxim Kuo, Tsung-Ting JAMIA Open Research and Applications OBJECTIVE: Predicting Coronavirus disease 2019 (COVID-19) mortality for patients is critical for early-stage care and intervention. Existing studies mainly built models on datasets with limited geographical range or size. In this study, we developed COVID-19 mortality prediction models on worldwide, large-scale “sparse” data and on a “dense” subset of the data. MATERIALS AND METHODS: We evaluated 6 classifiers, including logistic regression (LR), support vector machine (SVM), random forest (RF), multilayer perceptron (MLP), AdaBoost (AB), and Naive Bayes (NB). We also conducted temporal analysis and calibrated our models using Isotonic Regression. RESULTS: The results showed that AB outperformed the other classifiers for the sparse dataset, while LR provided the highest-performing results for the dense dataset (with area under the receiver operating characteristic curve, or AUC ≈ 0.7 for the sparse dataset and AUC = 0.963 for the dense one). We also identified impactful features such as symptoms, countries, age, and the date of death/discharge. All our models are well-calibrated (P > .1). DISCUSSION: Our results highlight the tradeoff of using sparse training data to increase generalizability versus training on denser data, which produces higher discrimination results. We found that covariates such as patient information on symptoms, countries (where the case was reported), age, and the date of discharge from the hospital or death were the most important for mortality prediction. CONCLUSION: This study is a stepping-stone towards improving healthcare quality during the COVID-19 era and potentially other pandemics. Our code is publicly available at: https://doi.org/10.5281/zenodo.6336231. 
Oxford University Press 2022-05-25 /pmc/articles/PMC9129227/ /pubmed/35663116 http://dx.doi.org/10.1093/jamiaopen/ooac036 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of the American Medical Informatics Association. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Research and Applications Edelson, Maxim Kuo, Tsung-Ting Generalizable prediction of COVID-19 mortality on worldwide patient data |
title | Generalizable prediction of COVID-19 mortality on worldwide patient data |
title_full | Generalizable prediction of COVID-19 mortality on worldwide patient data |
title_fullStr | Generalizable prediction of COVID-19 mortality on worldwide patient data |
title_full_unstemmed | Generalizable prediction of COVID-19 mortality on worldwide patient data |
title_short | Generalizable prediction of COVID-19 mortality on worldwide patient data |
title_sort | generalizable prediction of covid-19 mortality on worldwide patient data |
topic | Research and Applications |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9129227/ https://www.ncbi.nlm.nih.gov/pubmed/35663116 http://dx.doi.org/10.1093/jamiaopen/ooac036 |
work_keys_str_mv | AT edelsonmaxim generalizablepredictionofcovid19mortalityonworldwidepatientdata AT kuotsungting generalizablepredictionofcovid19mortalityonworldwidepatientdata |
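
The abstract above reports discrimination as AUC and says the models were calibrated with isotonic regression. The following is a minimal, standard-library sketch of those two ideas, not the authors' implementation (their code is at the Zenodo DOI in the description). The function names `auc`, `isotonic_fit`, and `isotonic_predict`, and the six toy scores and labels, are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the chance that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def isotonic_fit(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Pool-adjacent-violators fit of a non-decreasing step function score -> probability.

    Returns (upper_score, calibrated_value) pairs sorted by upper_score.
    """
    # Pool examples that share the same raw score so ties get one value.
    pooled: Dict[float, Tuple[float, int]] = {}
    for s, y in zip(scores, labels):
        total, count = pooled.get(s, (0.0, 0))
        pooled[s] = (total + y, count + 1)

    blocks: List[List[float]] = []  # each block: [label_sum, count, upper_score]
    for s in sorted(pooled):
        total, count = pooled[s]
        blocks.append([total, count, s])
        # Merge backwards while the previous block's mean exceeds the current one.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    return [(upper, total / count) for total, count, upper in blocks]


def isotonic_predict(model: List[Tuple[float, float]], score: float) -> float:
    """Look up the calibrated value of the first block whose upper bound covers the score."""
    for upper, value in model:
        if score <= upper:
            return value
    return model[-1][1]  # scores above the training range get the last value


# Toy data (hypothetical): raw classifier scores and observed outcomes (1 = death).
raw_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
outcomes = [0, 1, 0, 0, 1, 1]

model = isotonic_fit(raw_scores, outcomes)
print("AUC on raw scores:", auc(raw_scores, outcomes))
print("calibration steps:", model)
print("calibrated p(0.25):", isotonic_predict(model, 0.25))
```

In practice the calibration map would be fit on held-out data rather than on the same examples used to train the classifier; the sketch skips that split to stay short.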