
A random forest model for forecasting regional COVID-19 cases utilizing reproduction number estimates and demographic data

During the COVID-19 pandemic, predicting case spikes at the local level is important for a precise, targeted public health response and is generally done with compartmental models. The performance of compartmental models is highly dependent on the accuracy of their assumptions about disease dynamics...


Bibliographic Details
Main authors: Galasso, Joseph; Cao, Duy M.; Hochberg, Robert
Format: Online Article Text
Language: English
Published: Elsevier Ltd. 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8731233/
https://www.ncbi.nlm.nih.gov/pubmed/35013654
http://dx.doi.org/10.1016/j.chaos.2021.111779
author Galasso, Joseph
Cao, Duy M.
Hochberg, Robert
collection PubMed
description During the COVID-19 pandemic, predicting case spikes at the local level is important for a precise, targeted public health response and is generally done with compartmental models. The performance of compartmental models is highly dependent on the accuracy of their assumptions about disease dynamics within a population; thus, such models are susceptible to human error, unexpected events, or unknown characteristics of a novel infectious agent like COVID-19. We present a relatively non-parametric random forest model that forecasts the number of COVID-19 cases at the U.S. county level. Its most prioritized training features are derived from easily accessible, standard epidemiological data (i.e., regional test positivity rate) and the effective reproduction number (R_t) from compartmental models. A novel input training feature is case projections generated by aligning the estimated effective reproduction number (pre-computed by COVIDActNow.org) with real-time testing data until maximally correlated, helping our model fit better to the epidemic’s trajectory as ascertained by traditional models. Poor reliability of R_t is partially mitigated by using dynamic population mobility and the prevalence and mortality of non-COVID-19 diseases to gauge population disease susceptibility. The model was used to generate forecasts for 1, 2, 3, and 4 weeks into the future for each reference week within 11/01/2020 - 01/10/2021 for 3068 counties. Over this time period, it maintained a mean absolute error (MAE) of less than 300 weekly cases/100,000 and consistently outperformed or performed comparably with gold-standard compartmental models. Furthermore, it holds great potential in ensemble modeling due to its capacity for a more expansive training feature set while maintaining good performance and limited resource utilization.
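The description mentions aligning the reproduction number estimate with testing data "until maximally correlated" and scoring forecasts by mean absolute error. The record does not say how either step is implemented, so the following is only a minimal sketch under stated assumptions: the alignment is taken to be a search over whole-step lags using Pearson correlation, and the names `pearson`, `best_lag`, and `mean_absolute_error` are hypothetical helpers, not taken from the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # A constant series has no defined correlation; report 0.0 instead.
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def best_lag(rt, signal, max_lag):
    """Return (lag, r) for the lag in [0, max_lag] at which rt[t] is most
    correlated with signal[t + lag]. Assumption: a simple exhaustive search."""
    best = (0, float("-inf"))
    for lag in range(max_lag + 1):
        shifted = signal[lag:]
        n = min(len(rt), len(shifted))
        if n < 3:  # too few overlapping points to correlate meaningfully
            break
        r = pearson(rt[:n], shifted[:n])
        if r > best[1]:
            best = (lag, r)
    return best


def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and forecast values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
```

As a usage sketch, calling `best_lag(rt_series, positivity_series, max_lag=14)` on two daily series of equal length would return the shift at which the two line up best, and `mean_absolute_error` can then be applied to weekly cases per 100,000 to obtain a figure comparable in kind to the MAE quoted in the description.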
format Online
Article
Text
id pubmed-8731233
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier Ltd.
record_format MEDLINE/PubMed
spelling pubmed-8731233 2022-01-06 Chaos Solitons Fractals Article Elsevier Ltd. 2022-03 2022-01-05 /pmc/articles/PMC8731233/ /pubmed/35013654 http://dx.doi.org/10.1016/j.chaos.2021.111779 Text en
© 2022 Elsevier Ltd. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database, with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
title A random forest model for forecasting regional COVID-19 cases utilizing reproduction number estimates and demographic data
topic Article