Identification of Variable Importance for Predictions of Mortality From COVID-19 Using AI Models for Ontario, Canada

The Severe Acute Respiratory Syndrome Coronavirus 2 pandemic has challenged medical systems to the brink of collapse around the globe. In this paper, logistic regression and three other artificial intelligence models (XGBoost, Artificial Neural Network and Random Forest) are described and used to pr...

Bibliographic Details
Main authors: Snider, Brett; McBean, Edward A.; Yawney, John; Gadsden, S. Andrew; Patel, Bhumi
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Public Health
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8255789/
https://www.ncbi.nlm.nih.gov/pubmed/34235131
http://dx.doi.org/10.3389/fpubh.2021.675766
author Snider, Brett
McBean, Edward A.
Yawney, John
Gadsden, S. Andrew
Patel, Bhumi
collection PubMed
description The Severe Acute Respiratory Syndrome Coronavirus 2 pandemic has pushed medical systems around the globe to the brink of collapse. In this paper, logistic regression and three other artificial intelligence models (XGBoost, Artificial Neural Network and Random Forest) are described and used to predict mortality risk of individual patients. The database is based on census data for the designated area and co-morbidities obtained using data from the Ontario Health Data Platform. The dataset consisted of more than 280,000 COVID-19 cases in Ontario across a wide range of age groups: 0–9, 10–19, 20–29, 30–39, 40–49, 50–59, 60–69, 70–79, 80–89, and 90+. Logistic regression, XGBoost, Artificial Neural Network and Random Forest all demonstrate excellent discrimination (the area under the curve exceeded 0.948 for all models, with the best performance being 0.956 for an XGBoost model). Based on SHapley Additive exPlanations values, the importance of 24 variables is identified, and the findings indicate that the most important variables are, in order of importance, age, date of test, sex, and presence/absence of chronic dementia. The findings from this study allow the identification of out-patients who are likely to deteriorate into severe cases, allowing medical professionals to make decisions on timely treatments. Furthermore, the methodology and results may be extended to other public health regions.
format Online
Article
Text
id pubmed-8255789
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8255789 2021-07-06 Identification of Variable Importance for Predictions of Mortality From COVID-19 Using AI Models for Ontario, Canada. Snider, Brett; McBean, Edward A.; Yawney, John; Gadsden, S. Andrew; Patel, Bhumi. Front Public Health. Public Health. Frontiers Media S.A. 2021-06-21 /pmc/articles/PMC8255789/ /pubmed/34235131 http://dx.doi.org/10.3389/fpubh.2021.675766 Text en Copyright © 2021 Snider, McBean, Yawney, Gadsden and Patel.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Identification of Variable Importance for Predictions of Mortality From COVID-19 Using AI Models for Ontario, Canada
topic Public Health