Generalizable prediction of COVID-19 mortality on worldwide patient data

OBJECTIVE: Predicting Coronavirus disease 2019 (COVID-19) mortality for patients is critical for early-stage care and intervention. Existing studies mainly built models on datasets with limited geographical range or size. In this study, we developed COVID-19 mortality prediction models on worldwide,...

Bibliographic Details
Main Authors: Edelson, Maxim; Kuo, Tsung-Ting
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9129227/
https://www.ncbi.nlm.nih.gov/pubmed/35663116
http://dx.doi.org/10.1093/jamiaopen/ooac036
_version_ 1784712702662279168
author Edelson, Maxim
Kuo, Tsung-Ting
author_facet Edelson, Maxim
Kuo, Tsung-Ting
author_sort Edelson, Maxim
collection PubMed
description OBJECTIVE: Predicting Coronavirus disease 2019 (COVID-19) mortality for patients is critical for early-stage care and intervention. Existing studies mainly built models on datasets with limited geographical range or size. In this study, we developed COVID-19 mortality prediction models on worldwide, large-scale “sparse” data and on a “dense” subset of the data. MATERIALS AND METHODS: We evaluated 6 classifiers, including logistic regression (LR), support vector machine (SVM), random forest (RF), multilayer perceptron (MLP), AdaBoost (AB), and Naive Bayes (NB). We also conducted temporal analysis and calibrated our models using Isotonic Regression. RESULTS: The results showed that AB outperformed the other classifiers for the sparse dataset, while LR provided the highest-performing results for the dense dataset (with area under the receiver operating characteristic curve, or AUC ≈ 0.7 for the sparse dataset and AUC = 0.963 for the dense one). We also identified impactful features such as symptoms, countries, age, and the date of death/discharge. All our models are well-calibrated (P > .1). DISCUSSION: Our results highlight the tradeoff of using sparse training data to increase generalizability versus training on denser data, which produces higher discrimination results. We found that covariates such as patient information on symptoms, countries (where the case was reported), age, and the date of discharge from the hospital or death were the most important for mortality prediction. CONCLUSION: This study is a stepping-stone towards improving healthcare quality during the COVID-19 era and potentially other pandemics. Our code is publicly available at: https://doi.org/10.5281/zenodo.6336231.
format Online
Article
Text
id pubmed-9129227
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9129227 2022-05-25 Generalizable prediction of COVID-19 mortality on worldwide patient data Edelson, Maxim Kuo, Tsung-Ting JAMIA Open Research and Applications
Oxford University Press 2022-05-25 /pmc/articles/PMC9129227/ /pubmed/35663116 http://dx.doi.org/10.1093/jamiaopen/ooac036 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of the American Medical Informatics Association. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Research and Applications
Edelson, Maxim
Kuo, Tsung-Ting
Generalizable prediction of COVID-19 mortality on worldwide patient data
title Generalizable prediction of COVID-19 mortality on worldwide patient data
title_sort generalizable prediction of covid-19 mortality on worldwide patient data
topic Research and Applications
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9129227/
https://www.ncbi.nlm.nih.gov/pubmed/35663116
http://dx.doi.org/10.1093/jamiaopen/ooac036
work_keys_str_mv AT edelsonmaxim generalizablepredictionofcovid19mortalityonworldwidepatientdata
AT kuotsungting generalizablepredictionofcovid19mortalityonworldwidepatientdata
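
The abstract in this record reports discrimination as AUC and says the models were calibrated with isotonic regression. The following is a minimal, standard-library sketch of those two ideas only. It is not the authors' implementation (their code is at the Zenodo DOI given in the description), the function names `auc` and `isotonic_fit` are hypothetical, and the toy labels and scores are made up for illustration rather than taken from the study.

```python
from typing import List, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    Equals the fraction of (positive, negative) pairs in which the positive
    case has the higher score; tied pairs count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def isotonic_fit(labels: Sequence[int], scores: Sequence[float]) -> List[float]:
    """Isotonic calibration by pool-adjacent-violators (PAV).

    Returns one calibrated probability per case, in the input order, such
    that the values never decrease as the raw score increases.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])

    # Step 1: pool cases with identical scores so they share one fitted value.
    # Each group is [sum of labels, number of cases, score].
    groups: list = []
    for i in order:
        if groups and groups[-1][2] == scores[i]:
            groups[-1][0] += labels[i]
            groups[-1][1] += 1
        else:
            groups.append([float(labels[i]), 1, scores[i]])

    # Step 2: PAV. Merge a block into its predecessor while the
    # predecessor's mean is larger (a monotonicity violation).
    blocks: list = []  # each block is [sum of labels, number of cases]
    for total, count, _ in groups:
        blocks.append([total, count])
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            merged_total, merged_count = blocks.pop()
            blocks[-1][0] += merged_total
            blocks[-1][1] += merged_count

    # Step 3: expand block means back to one value per case, input order.
    fitted_sorted: List[float] = []
    for total, count in blocks:
        fitted_sorted.extend([total / count] * count)
    fitted = [0.0] * len(scores)
    for rank, i in enumerate(order):
        fitted[i] = fitted_sorted[rank]
    return fitted


# Toy data (hypothetical): 1 = died, 0 = discharged, with raw model scores.
toy_labels = [0, 0, 1, 0, 1, 1]
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]

print(round(auc(toy_labels, toy_scores), 3))  # 0.889 (8 of 9 pairs ordered correctly)
print(isotonic_fit(toy_labels, toy_scores))   # [0.0, 0.0, 0.5, 0.5, 1.0, 1.0]
```

In the toy run, the cases scored 0.3 and 0.4 are out of order with respect to their outcomes, so PAV pools them to a shared calibrated value of 0.5. In practice the calibration map would be fitted on held-out data and then applied to new scores; this sketch only shows the fitting step on the data it is given.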