A large-scale machine learning study of sociodemographic factors contributing to COVID-19 severity

Understanding sociodemographic factors behind COVID-19 severity relates to significant methodological difficulties, such as differences in testing policies and epidemics phase, as well as a large number of predictors that can potentially contribute to severity. To account for these difficulties, we...

Bibliographic Details
Main authors: Tumbas, Marko, Markovic, Sofija, Salom, Igor, Djordjevic, Marko
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10080051/
https://www.ncbi.nlm.nih.gov/pubmed/37034433
http://dx.doi.org/10.3389/fdata.2023.1038283
_version_ 1785020840670134272
author Tumbas, Marko
Markovic, Sofija
Salom, Igor
Djordjevic, Marko
author_facet Tumbas, Marko
Markovic, Sofija
Salom, Igor
Djordjevic, Marko
author_sort Tumbas, Marko
collection PubMed
description Understanding sociodemographic factors behind COVID-19 severity relates to significant methodological difficulties, such as differences in testing policies and epidemics phase, as well as a large number of predictors that can potentially contribute to severity. To account for these difficulties, we assemble 115 predictors for more than 3,000 US counties and employ a well-defined COVID-19 severity measure derived from epidemiological dynamics modeling. We then use a number of advanced feature selection techniques from machine learning to determine which of these predictors significantly impact the disease severity. We obtain a surprisingly simple result, where only two variables are clearly and robustly selected—population density and proportion of African Americans. Possible causes behind this result are discussed. We argue that the approach may be useful whenever significant determinants of disease progression over diverse geographic regions should be selected from a large number of potentially important factors.
format Online
Article
Text
id pubmed-10080051
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-100800512023-04-08 A large-scale machine learning study of sociodemographic factors contributing to COVID-19 severity Tumbas, Marko Markovic, Sofija Salom, Igor Djordjevic, Marko Front Big Data Big Data Understanding sociodemographic factors behind COVID-19 severity relates to significant methodological difficulties, such as differences in testing policies and epidemics phase, as well as a large number of predictors that can potentially contribute to severity. To account for these difficulties, we assemble 115 predictors for more than 3,000 US counties and employ a well-defined COVID-19 severity measure derived from epidemiological dynamics modeling. We then use a number of advanced feature selection techniques from machine learning to determine which of these predictors significantly impact the disease severity. We obtain a surprisingly simple result, where only two variables are clearly and robustly selected—population density and proportion of African Americans. Possible causes behind this result are discussed. We argue that the approach may be useful whenever significant determinants of disease progression over diverse geographic regions should be selected from a large number of potentially important factors. Frontiers Media S.A. 2023-03-24 /pmc/articles/PMC10080051/ /pubmed/37034433 http://dx.doi.org/10.3389/fdata.2023.1038283 Text en Copyright © 2023 Tumbas, Markovic, Salom and Djordjevic. https://creativecommons.org/licenses/by/4.0/This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Big Data
Tumbas, Marko
Markovic, Sofija
Salom, Igor
Djordjevic, Marko
A large-scale machine learning study of sociodemographic factors contributing to COVID-19 severity
title A large-scale machine learning study of sociodemographic factors contributing to COVID-19 severity
title_full A large-scale machine learning study of sociodemographic factors contributing to COVID-19 severity
title_fullStr A large-scale machine learning study of sociodemographic factors contributing to COVID-19 severity
title_full_unstemmed A large-scale machine learning study of sociodemographic factors contributing to COVID-19 severity
title_short A large-scale machine learning study of sociodemographic factors contributing to COVID-19 severity
title_sort large-scale machine learning study of sociodemographic factors contributing to covid-19 severity
topic Big Data
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10080051/
https://www.ncbi.nlm.nih.gov/pubmed/37034433
http://dx.doi.org/10.3389/fdata.2023.1038283