A random forest model for forecasting regional COVID-19 cases utilizing reproduction number estimates and demographic data
During the COVID-19 pandemic, predicting case spikes at the local level is important for a precise, targeted public health response and is generally done with compartmental models. The performance of compartmental models is highly dependent on the accuracy of their assumptions about disease dynamics...
Main Authors: | Galasso, Joseph; Cao, Duy M.; Hochberg, Robert |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier Ltd. 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8731233/ https://www.ncbi.nlm.nih.gov/pubmed/35013654 http://dx.doi.org/10.1016/j.chaos.2021.111779 |
_version_ | 1784627314879889408 |
---|---|
author | Galasso, Joseph; Cao, Duy M.; Hochberg, Robert |
author_facet | Galasso, Joseph; Cao, Duy M.; Hochberg, Robert |
author_sort | Galasso, Joseph |
collection | PubMed |
description | During the COVID-19 pandemic, predicting case spikes at the local level is important for a precise, targeted public health response and is generally done with compartmental models. The performance of compartmental models is highly dependent on the accuracy of their assumptions about disease dynamics within a population; thus, such models are susceptible to human error, unexpected events, or unknown characteristics of a novel infectious agent like COVID-19. We present a relatively non-parametric random forest model that forecasts the number of COVID-19 cases at the U.S. county level. Its most prioritized training features are derived from easily accessible, standard epidemiological data (i.e., regional test positivity rate) and the effective reproduction number ($R_t$) from compartmental models. A novel input training feature is case projections generated by aligning estimated effective reproduction number (pre-computed by COVIDActNow.org) with real-time testing data until maximally correlated, helping our model fit better to the epidemic’s trajectory as ascertained by traditional models. Poor reliability of $R_t$ is partially mitigated with dynamic population mobility and prevalence and mortality of non-COVID-19 diseases to gauge population disease susceptibility. The model was used to generate forecasts for 1, 2, 3, and 4 weeks into the future for each reference week within 11/01/2020 - 01/10/2021 for 3068 counties. Over this time period, it maintained a mean absolute error (MAE) of less than 300 weekly cases/100,000 and consistently outperformed or performed comparably with gold-standard compartmental models. Furthermore, it holds great potential in ensemble modeling due to its potential for a more expansive training feature set while maintaining good performance and limited resource utilization. |
format | Online Article Text |
id | pubmed-8731233 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Elsevier Ltd. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8731233 2022-01-06 Chaos Solitons Fractals Article Elsevier Ltd. 2022-03 2022-01-05 /pmc/articles/PMC8731233/ /pubmed/35013654 http://dx.doi.org/10.1016/j.chaos.2021.111779 Text en © 2022 Elsevier Ltd. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active. |
title | A random forest model for forecasting regional COVID-19 cases utilizing reproduction number estimates and demographic data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8731233/ https://www.ncbi.nlm.nih.gov/pubmed/35013654 http://dx.doi.org/10.1016/j.chaos.2021.111779 |
work_keys_str_mv | AT galassojoseph arandomforestmodelforforecastingregionalcovid19casesutilizingreproductionnumberestimatesanddemographicdata AT caoduym arandomforestmodelforforecastingregionalcovid19casesutilizingreproductionnumberestimatesanddemographicdata AT hochbergrobert arandomforestmodelforforecastingregionalcovid19casesutilizingreproductionnumberestimatesanddemographicdata AT galassojoseph randomforestmodelforforecastingregionalcovid19casesutilizingreproductionnumberestimatesanddemographicdata AT caoduym randomforestmodelforforecastingregionalcovid19casesutilizingreproductionnumberestimatesanddemographicdata AT hochbergrobert randomforestmodelforforecastingregionalcovid19casesutilizingreproductionnumberestimatesanddemographicdata |
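
The description field above mentions two computations that a short sketch may help make concrete: aligning the effective reproduction number with testing data "until maximally correlated", and reporting mean absolute error in weekly cases per 100,000. The Python below is only an illustration under stated assumptions, not the authors' implementation. The function names (`pearson`, `best_lag`, `mae_per_100k`), the choice of a simple lag search over Pearson correlation, the per-county normalization by population, and the synthetic series are all hypothetical.

```python
from math import sin, sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


def best_lag(rt, observed, max_lag):
    """Search lags 0..max_lag and return (lag, r) maximizing corr(rt[t], observed[t + lag]).

    Assumption: "aligning until maximally correlated" is read here as a plain
    lag search; the paper may use a different procedure.
    """
    best = (None, float("-inf"))
    for lag in range(max_lag + 1):
        n = min(len(rt), len(observed) - lag)
        if n < 3:  # too few overlapping points for a meaningful correlation
            continue
        r = pearson(rt[:n], observed[lag:lag + n])
        if r > best[1]:
            best = (lag, r)
    return best


def mae_per_100k(predicted, actual, population):
    """Mean absolute error after scaling each county's error to cases per 100,000."""
    errors = [
        abs(p - a) / pop * 100_000
        for p, a, pop in zip(predicted, actual, population)
    ]
    return sum(errors) / len(errors)


# Synthetic data: `observed` is `rt` delayed by three time steps.
rt = [sin(t / 5) for t in range(60)]
observed = [0.0, 0.0, 0.0] + rt[:-3]

lag, r = best_lag(rt, observed, max_lag=10)
print("best lag:", lag, "correlation:", round(r, 3))
print("MAE per 100k:", mae_per_100k([110, 90], [100, 100], [100_000, 50_000]))
```

In a fuller pipeline, the lag found this way could be used to shift the reproduction-number series before passing it, alongside features such as test positivity, to a random forest regressor; that step is omitted here because the record does not describe the model's configuration.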