Predicting COVID-19 county-level case number trend by combining demographic characteristics and social distancing policies
OBJECTIVE: Predicting daily trends in the Coronavirus Disease 2019 (COVID-19) case number is important to support individual decisions in taking preventative measures. This study aims to use COVID-19 case number history, demographic characteristics, and social distancing policies both independently/...
Main authors: | Li, Megan Mun; Pham, Anh; Kuo, Tsung-Ting |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9278037/ https://www.ncbi.nlm.nih.gov/pubmed/35855422 http://dx.doi.org/10.1093/jamiaopen/ooac056 |
_version_ | 1784746115197829120 |
---|---|
author | Li, Megan Mun; Pham, Anh; Kuo, Tsung-Ting |
author_facet | Li, Megan Mun; Pham, Anh; Kuo, Tsung-Ting |
author_sort | Li, Megan Mun |
collection | PubMed |
description | OBJECTIVE: Predicting daily trends in the Coronavirus Disease 2019 (COVID-19) case number is important to support individual decisions in taking preventative measures. This study aims to use COVID-19 case number history, demographic characteristics, and social distancing policies both independently/interdependently to predict the daily trend in the rise or fall of county-level cases. MATERIALS AND METHODS: We extracted 2093 features (5 from the US COVID-19 case number history, 1824 from the demographic characteristics independently/interdependently, and 264 from the social distancing policies independently/interdependently) for 3142 US counties. Using the top 200 selected features, we built 4 machine learning models (Logistic Regression, Naïve Bayes, Multi-Layer Perceptron, and Random Forest) along with 4 Ensemble methods (Average, Product, Minimum, and Maximum) and compared their performance. RESULTS: The Ensemble Average method had the highest area under the receiver operating characteristic curve (AUC), 0.692. The top-ranked features were all interdependent features. CONCLUSION: The findings of this study suggest the predictive power of diverse features, especially when combined, in predicting county-level trends of COVID-19 cases, and they can be helpful to individuals in making their daily decisions. Our results may guide future studies to consider more features interdependently from conventionally distinct data sources in county-level predictive models. Our code is available at: https://doi.org/10.5281/zenodo.6332944. |
format | Online Article Text |
id | pubmed-9278037 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9278037 2022-07-18. Predicting COVID-19 county-level case number trend by combining demographic characteristics and social distancing policies. Li, Megan Mun; Pham, Anh; Kuo, Tsung-Ting. JAMIA Open, Research and Applications. Oxford University Press, 2022-06-25. /pmc/articles/PMC9278037/ /pubmed/35855422 http://dx.doi.org/10.1093/jamiaopen/ooac056 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of the American Medical Informatics Association. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
title | Predicting COVID-19 county-level case number trend by combining demographic characteristics and social distancing policies |
topic | Research and Applications |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9278037/ https://www.ncbi.nlm.nih.gov/pubmed/35855422 http://dx.doi.org/10.1093/jamiaopen/ooac056 |
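
The description above names four ensemble rules (Average, Product, Minimum, Maximum) applied to the outputs of four classifiers and compared by AUC. As an illustration only, the following Python sketch shows one common way such rules can be applied to per-model positive-class probabilities, together with a rank-based AUC. It is not the authors' code (which is at the Zenodo DOI given in the description); the function names `combine` and `auc` and all numbers in `model_probs` and `labels` are hypothetical.

```python
from math import prod


def combine(probabilities, rule):
    """Combine per-model positive-class probabilities for one sample.

    `rule` is one of "average", "product", "minimum", "maximum".
    """
    rules = {
        "average": lambda p: sum(p) / len(p),
        "product": prod,
        "minimum": min,
        "maximum": max,
    }
    return rules[rule](probabilities)


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical data: each row is one county-day, each column one model's
# predicted probability that the case number rises.
model_probs = [
    [0.9, 0.8, 0.7, 0.6],
    [0.2, 0.4, 0.1, 0.3],
    [0.6, 0.3, 0.8, 0.5],
    [0.4, 0.5, 0.2, 0.1],
]
labels = [1, 0, 1, 0]  # hypothetical ground truth: 1 = rise, 0 = fall

for rule in ("average", "product", "minimum", "maximum"):
    scores = [combine(row, rule) for row in model_probs]
    print(rule, auc(labels, scores))
```

Because AUC depends only on the ranking of scores, the rules differ in practice only when they order samples differently; the toy data here is too small to show such differences.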