Identification of Variable Importance for Predictions of Mortality From COVID-19 Using AI Models for Ontario, Canada
The Severe Acute Respiratory Syndrome Coronavirus 2 pandemic has challenged medical systems to the brink of collapse around the globe. In this paper, logistic regression and three other artificial intelligence models (XGBoost, Artificial Neural Network and Random Forest) are described and used to pr...
Main Authors: | Snider, Brett; McBean, Edward A.; Yawney, John; Gadsden, S. Andrew; Patel, Bhumi |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2021 |
Subjects: | Public Health |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8255789/ https://www.ncbi.nlm.nih.gov/pubmed/34235131 http://dx.doi.org/10.3389/fpubh.2021.675766 |
_version_ | 1783717980784295936 |
---|---|
author | Snider, Brett; McBean, Edward A.; Yawney, John; Gadsden, S. Andrew; Patel, Bhumi |
author_facet | Snider, Brett; McBean, Edward A.; Yawney, John; Gadsden, S. Andrew; Patel, Bhumi |
author_sort | Snider, Brett |
collection | PubMed |
description | The Severe Acute Respiratory Syndrome Coronavirus 2 pandemic has pushed medical systems around the globe to the brink of collapse. In this paper, logistic regression and three other artificial intelligence models (XGBoost, Artificial Neural Network and Random Forest) are described and used to predict the mortality risk of individual patients. The database is built from census data for the designated area and co-morbidities obtained from the Ontario Health Data Platform. The dataset consists of more than 280,000 COVID-19 cases in Ontario across a wide range of age groups: 0–9, 10–19, 20–29, 30–39, 40–49, 50–59, 60–69, 70–79, 80–89, and 90+. All four models (logistic regression, XGBoost, Artificial Neural Network and Random Forest) demonstrate excellent discrimination: the area under the curve exceeded 0.948 for every model, with the best performance, 0.956, achieved by an XGBoost model. Based on SHapley Additive exPlanations values, the importance of 24 variables is identified; the most important variables are, in order of importance, age, date of test, sex, and presence/absence of chronic dementia. The findings from this study allow the identification of out-patients who are likely to deteriorate into severe cases, allowing medical professionals to make timely treatment decisions. Furthermore, the methodology and results may be extended to other public health regions. |
format | Online Article Text |
id | pubmed-8255789 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8255789 2021-07-06 Identification of Variable Importance for Predictions of Mortality From COVID-19 Using AI Models for Ontario, Canada. Snider, Brett; McBean, Edward A.; Yawney, John; Gadsden, S. Andrew; Patel, Bhumi. Front Public Health. Public Health. Frontiers Media S.A. 2021-06-21 /pmc/articles/PMC8255789/ /pubmed/34235131 http://dx.doi.org/10.3389/fpubh.2021.675766 Text en Copyright © 2021 Snider, McBean, Yawney, Gadsden and Patel. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Identification of Variable Importance for Predictions of Mortality From COVID-19 Using AI Models for Ontario, Canada |
topic | Public Health |
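
The description above reports two quantities that are easy to misread without a concrete example: the area under the curve (AUC) used to judge discrimination, and the SHapley Additive exPlanations values used to rank variables. The following is a minimal sketch of both ideas, not the authors' code. The `log_odds` function, its coefficients, the example patient, and the baseline are hypothetical and chosen only for illustration; the study itself used trained models and 24 variables, and SHAP libraries typically approximate these values rather than enumerating every coalition as done here.

```python
from itertools import combinations
from math import factorial


def auc(scores, labels):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as half a win)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def shapley_values(value_fn, x, baseline):
    """Exact Shapley values by enumerating all coalitions.
    Features outside a coalition are replaced by their baseline value.
    Cost grows exponentially with the number of features, so this is
    only practical for toy examples."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                phi[i] += weight * (value_fn(with_i) - value_fn(without_i))
    return phi


def log_odds(features):
    """Hypothetical mortality log-odds; coefficients are made up."""
    age, male, dementia = features
    return -7.0 + 0.08 * age + 0.4 * male + 1.0 * dementia


# Hypothetical patient (age 85, male, dementia) versus a hypothetical
# reference case (age 40, female, no dementia).
patient = [85, 1, 1]
baseline = [40, 0, 0]
phi = shapley_values(log_odds, patient, baseline)
for name, value in zip(["age", "sex", "dementia"], phi):
    print(f"{name}: {value:+.2f}")

# Four scored cases, two of which died (label 1).
print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```

The Shapley values sum to the difference between the patient's log-odds and the baseline's log-odds, which is the additivity property that makes them usable as a per-variable breakdown of a single prediction.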