Machine Learning for Analyzing Non-Countermeasure Factors Affecting Early Spread of COVID-19

The COVID-19 pandemic affected the whole world, but not all countries were impacted equally. This opens the question of what factors can explain the initial faster spread in some countries compared to others. Many such factors are overshadowed by the effect of the countermeasures, so we studied the...

Bibliographic Details
Main authors: Janko, Vito; Slapničar, Gašper; Dovgan, Erik; Reščič, Nina; Kolenik, Tine; Gjoreski, Martin; Smerkol, Maj; Gams, Matjaž; Luštrek, Mitja
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8268491/
https://www.ncbi.nlm.nih.gov/pubmed/34201618
http://dx.doi.org/10.3390/ijerph18136750
_version_ 1783720370174427136
author Janko, Vito
Slapničar, Gašper
Dovgan, Erik
Reščič, Nina
Kolenik, Tine
Gjoreski, Martin
Smerkol, Maj
Gams, Matjaž
Luštrek, Mitja
author_sort Janko, Vito
collection PubMed
description The COVID-19 pandemic affected the whole world, but not all countries were impacted equally. This opens the question of what factors can explain the initial faster spread in some countries compared to others. Many such factors are overshadowed by the effect of the countermeasures, so we studied the early phases of the infection when countermeasures had not yet taken place. We collected the most diverse dataset of potentially relevant factors and infection metrics to date for this task. Using it, we show the importance of different factors and factor categories as determined by both statistical methods and machine learning (ML) feature selection (FS) approaches. Factors related to culture (e.g., individualism, openness), development, and travel proved the most important. A more thorough factor analysis was then made using a novel rule discovery algorithm. We also show how interconnected these factors are and caution against relying on ML analysis in isolation. Importantly, we explore potential pitfalls found in the methodology of similar work and demonstrate their impact on COVID-19 data analysis. Our best models using the decision tree classifier can predict the infection class with roughly 80% accuracy.
format Online
Article
Text
id pubmed-8268491
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-82684912021-07-10 Machine Learning for Analyzing Non-Countermeasure Factors Affecting Early Spread of COVID-19 Janko, Vito Slapničar, Gašper Dovgan, Erik Reščič, Nina Kolenik, Tine Gjoreski, Martin Smerkol, Maj Gams, Matjaž Luštrek, Mitja Int J Environ Res Public Health Article The COVID-19 pandemic affected the whole world, but not all countries were impacted equally. This opens the question of what factors can explain the initial faster spread in some countries compared to others. Many such factors are overshadowed by the effect of the countermeasures, so we studied the early phases of the infection when countermeasures had not yet taken place. We collected the most diverse dataset of potentially relevant factors and infection metrics to date for this task. Using it, we show the importance of different factors and factor categories as determined by both statistical methods and machine learning (ML) feature selection (FS) approaches. Factors related to culture (e.g., individualism, openness), development, and travel proved the most important. A more thorough factor analysis was then made using a novel rule discovery algorithm. We also show how interconnected these factors are and caution against relying on ML analysis in isolation. Importantly, we explore potential pitfalls found in the methodology of similar work and demonstrate their impact on COVID-19 data analysis. Our best models using the decision tree classifier can predict the infection class with roughly 80% accuracy. MDPI 2021-06-23 /pmc/articles/PMC8268491/ /pubmed/34201618 http://dx.doi.org/10.3390/ijerph18136750 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Machine Learning for Analyzing Non-Countermeasure Factors Affecting Early Spread of COVID-19
topic Article