
Country-level pandemic risk and preparedness classification based on COVID-19 data: A machine learning approach

In this work we present a three-stage Machine Learning strategy for country-level risk classification based on countries that are reporting COVID-19 information. A K% binning discretisation (K = 25) is used to create four risk groups of countries based on the risk of transmission (coronavirus cases p...


Bibliographic Details
Main authors: Bird, Jordan J., Barnes, Chloe M., Premebida, Cristiano, Ekárt, Anikó, Faria, Diego R.
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7592809/
https://www.ncbi.nlm.nih.gov/pubmed/33112931
http://dx.doi.org/10.1371/journal.pone.0241332
author Bird, Jordan J.
Barnes, Chloe M.
Premebida, Cristiano
Ekárt, Anikó
Faria, Diego R.
collection PubMed
description In this work we present a three-stage Machine Learning strategy for country-level risk classification based on countries that are reporting COVID-19 information. A K% binning discretisation (K = 25) is used to create four risk groups of countries based on the risk of transmission (coronavirus cases per million population), risk of mortality (coronavirus deaths per million population), and risk of inability to test (coronavirus tests per million population). The four risk groups produced by K% binning are labelled as ‘low’, ‘medium-low’, ‘medium-high’, and ‘high’. Coronavirus-related data are then removed, and the attributes used to predict the three types of risk are the geopolitical and demographic data describing each country. Thus, the class labels are calculated from coronavirus data, but the input attributes are country-level information independent of coronavirus data. The three four-class classification problems are then explored and benchmarked through leave-one-country-out cross validation to find the strongest model, producing a Stack of Gradient Boosting and Decision Tree algorithms for risk of transmission, a Stack of Support Vector Machine and Extra Trees for risk of mortality, and a Gradient Boosting algorithm for the risk of inability to test. It is noted that high risk for inability to test is often coupled with low risks for transmission and mortality; therefore, the risk of inability to test should be interpreted first, before consideration is given to the predicted transmission and mortality risks. Finally, the approach is applied to more recent risk levels, using data from September 2020, and weaker results are noted. These are attributed to the growth of international collaboration reducing the useful knowledge carried by country-level attributes, which suggests that similar machine learning approaches are more useful early on, before a situation has unfolded.
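The labelling and evaluation steps described in the abstract can be illustrated with a minimal sketch. It assumes that K% binning with K = 25 means equal-frequency (quartile) bins, and it uses a 1-nearest-neighbour classifier purely as a stand-in for the stacked models named above; it is not the authors' implementation. The function names and the toy numbers are hypothetical. The sketch labels higher values as higher risk, so for tests per million the ordering would presumably need to be reversed.

```python
from bisect import bisect_left
from statistics import quantiles

LABELS = ["low", "medium-low", "medium-high", "high"]


def quartile_labels(values):
    """Assign each value to one of four equal-frequency (25%) bins."""
    # quantiles(n=4) returns the three cut points separating the four bins.
    cuts = quantiles(values, n=4, method="inclusive")
    # bisect_left counts the cut points strictly below the value,
    # which is the index of the bin the value falls into.
    return [LABELS[bisect_left(cuts, v)] for v in values]


def nearest_label(train_x, train_y, query):
    """1-nearest-neighbour prediction (stand-in model, an assumption)."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, query)) for x in train_x]
    return train_y[dists.index(min(dists))]


def leave_one_out_accuracy(features, labels):
    """Hold out each country once, fit on the rest, score the prediction."""
    hits = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        hits += nearest_label(train_x, train_y, features[i]) == labels[i]
    return hits / len(features)


# Hypothetical cases-per-million figures for eight countries.
cases_per_million = [10, 20, 30, 40, 50, 60, 70, 80]
risk = quartile_labels(cases_per_million)

# Hypothetical one-dimensional country attribute and its risk labels.
toy_features = [[0.0], [0.1], [1.0], [1.1]]
toy_labels = ["low", "low", "high", "high"]
accuracy = leave_one_out_accuracy(toy_features, toy_labels)
```

Note that the labels are derived from the coronavirus figures while the classifier only sees country attributes, mirroring the separation between label calculation and input attributes that the abstract describes.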
format Online
Article
Text
id pubmed-7592809
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7592809 2020-11-02 PLoS One Research Article Public Library of Science 2020-10-28 /pmc/articles/PMC7592809/ /pubmed/33112931 http://dx.doi.org/10.1371/journal.pone.0241332 Text en © 2020 Bird et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Country-level pandemic risk and preparedness classification based on COVID-19 data: A machine learning approach
topic Research Article