A large-scale machine learning study of sociodemographic factors contributing to COVID-19 severity
Understanding sociodemographic factors behind COVID-19 severity relates to significant methodological difficulties, such as differences in testing policies and epidemics phase, as well as a large number of predictors that can potentially contribute to severity. To account for these difficulties, we...
Main authors: | Tumbas, Marko; Markovic, Sofija; Salom, Igor; Djordjevic, Marko |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A., 2023 |
Subjects: | Big Data |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10080051/ https://www.ncbi.nlm.nih.gov/pubmed/37034433 http://dx.doi.org/10.3389/fdata.2023.1038283 |
_version_ | 1785020840670134272 |
---|---|
author | Tumbas, Marko Markovic, Sofija Salom, Igor Djordjevic, Marko |
collection | PubMed |
description | Understanding sociodemographic factors behind COVID-19 severity relates to significant methodological difficulties, such as differences in testing policies and epidemics phase, as well as a large number of predictors that can potentially contribute to severity. To account for these difficulties, we assemble 115 predictors for more than 3,000 US counties and employ a well-defined COVID-19 severity measure derived from epidemiological dynamics modeling. We then use a number of advanced feature selection techniques from machine learning to determine which of these predictors significantly impact the disease severity. We obtain a surprisingly simple result, where only two variables are clearly and robustly selected—population density and proportion of African Americans. Possible causes behind this result are discussed. We argue that the approach may be useful whenever significant determinants of disease progression over diverse geographic regions should be selected from a large number of potentially important factors. |
format | Online Article Text |
id | pubmed-10080051 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-10080051 2023-04-08 A large-scale machine learning study of sociodemographic factors contributing to COVID-19 severity Tumbas, Marko Markovic, Sofija Salom, Igor Djordjevic, Marko Front Big Data Big Data Understanding sociodemographic factors behind COVID-19 severity relates to significant methodological difficulties, such as differences in testing policies and epidemics phase, as well as a large number of predictors that can potentially contribute to severity. To account for these difficulties, we assemble 115 predictors for more than 3,000 US counties and employ a well-defined COVID-19 severity measure derived from epidemiological dynamics modeling. We then use a number of advanced feature selection techniques from machine learning to determine which of these predictors significantly impact the disease severity. We obtain a surprisingly simple result, where only two variables are clearly and robustly selected—population density and proportion of African Americans. Possible causes behind this result are discussed. We argue that the approach may be useful whenever significant determinants of disease progression over diverse geographic regions should be selected from a large number of potentially important factors. Frontiers Media S.A. 2023-03-24 /pmc/articles/PMC10080051/ /pubmed/37034433 http://dx.doi.org/10.3389/fdata.2023.1038283 Text en Copyright © 2023 Tumbas, Markovic, Salom and Djordjevic. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | A large-scale machine learning study of sociodemographic factors contributing to COVID-19 severity |
topic | Big Data |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10080051/ https://www.ncbi.nlm.nih.gov/pubmed/37034433 http://dx.doi.org/10.3389/fdata.2023.1038283 |