Early Stage Machine Learning–Based Prediction of US County Vulnerability to the COVID-19 Pandemic: Machine Learning Approach

BACKGROUND: The rapid spread of COVID-19 means that government and health services providers have little time to plan and design effective response policies. It is therefore important to quickly provide accurate predictions of how vulnerable geographic regions such as counties are to the spread of t...

Bibliographic Details
Main authors: Mehta, Mihir; Julaiti, Juxihong; Griffin, Paul; Kumara, Soundar
Format: Online Article Text
Language: English
Published: JMIR Publications 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7490002/
https://www.ncbi.nlm.nih.gov/pubmed/32784193
http://dx.doi.org/10.2196/19446
collection PubMed
description BACKGROUND: The rapid spread of COVID-19 means that government and health services providers have little time to plan and design effective response policies. It is therefore important to quickly provide accurate predictions of how vulnerable geographic regions such as counties are to the spread of this virus.
OBJECTIVE: The aim of this study is to develop county-level prediction around near future disease movement for COVID-19 occurrences using publicly available data.
METHODS: We estimated county-level COVID-19 occurrences for the period March 14 to 31, 2020, based on data fused from multiple publicly available sources inclusive of health statistics, demographics, and geographical features. We developed a three-stage model using XGBoost, a machine learning algorithm, to quantify the probability of COVID-19 occurrence and estimate the number of potential occurrences for unaffected counties. Finally, these results were combined to predict the county-level risk. This risk was then used as an estimated after-five-day-vulnerability of the county.
RESULTS: The model predictions showed a sensitivity over 71% and specificity over 94% for models built using data from March 14 to 31, 2020. We found that population, population density, percentage of people aged >70 years, and prevalence of comorbidities play an important role in predicting COVID-19 occurrences. We observed a positive association at the county level between urbanicity and vulnerability to COVID-19.
CONCLUSIONS: The developed model can be used for identification of vulnerable counties and potential data discrepancies. Limited testing facilities and delayed results introduce significant variation in reported cases, which produces a bias in the model.
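The abstract describes combining a predicted probability of occurrence with a predicted number of occurrences into a county-level risk, and reports sensitivity and specificity. The following is a minimal sketch of those two calculations, not the authors' implementation. The record does not say how the two stages are combined, so the multiplication rule here is an assumption, as are the class name `CountyPrediction`, its field names, and the function names. In the study the two inputs would come from XGBoost models; here they are plain numbers.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class CountyPrediction:
    """Hypothetical container for the outputs of the first two model stages."""
    county_id: str         # hypothetical county identifier
    p_occurrence: float    # stage 1: predicted probability of any occurrence
    expected_cases: float  # stage 2: predicted number of occurrences, given any occur


def county_risk(pred: CountyPrediction) -> float:
    """Stage 3 under an ASSUMED rule: risk = probability x predicted count."""
    return pred.p_occurrence * pred.expected_cases


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels coded as 0 or 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # affected, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # affected, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # unaffected, not flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # unaffected, flagged
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the study.
    example = CountyPrediction("county-A", p_occurrence=0.5, expected_cases=8.0)
    print(county_risk(example))
    print(sensitivity_specificity([1, 1, 0, 0, 0], [1, 0, 0, 0, 1]))
```

Sensitivity here is the share of counties with reported occurrences that the model flagged, and specificity is the share of counties without reported occurrences that the model left unflagged.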
format Online
Article
Text
id pubmed-7490002
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-7490002 2020-10-01 Early Stage Machine Learning–Based Prediction of US County Vulnerability to the COVID-19 Pandemic: Machine Learning Approach. Mehta, Mihir; Julaiti, Juxihong; Griffin, Paul; Kumara, Soundar. JMIR Public Health Surveill. Original Paper.
JMIR Publications 2020-09-11 /pmc/articles/PMC7490002/ /pubmed/32784193 http://dx.doi.org/10.2196/19446 Text en ©Mihir Mehta, Juxihong Julaiti, Paul Griffin, Soundar Kumara. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 11.09.2020. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited. The complete bibliographic information, a link to the original publication on http://publichealth.jmir.org, as well as this copyright and license information must be included.
topic Original Paper