
An infodemiological framework for tracking the spread of SARS-CoV-2 using integrated public data

Bibliographic Details
Main authors: Liu, Zhimin, Jiang, Zuodong, Kip, Geoffrey, Snigdha, Kirti, Xu, Jennings, Wu, Xiaoying, Khan, Najat, Schultz, Timothy
Format: Online Article Text
Language: English
Published: The Authors. Published by Elsevier B.V. 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9040481/
https://www.ncbi.nlm.nih.gov/pubmed/35496673
http://dx.doi.org/10.1016/j.patrec.2022.04.030
Description
Summary: The outbreak of the SARS-CoV-2 novel coronavirus has caused a health crisis of immeasurable magnitude. Signals from heterogeneous public data sources could serve as early predictors for infection waves of the pandemic, particularly in its early phases, when infection data was scarce. In this article, we characterize temporal pandemic indicators by leveraging an integrated set of public data and apply them to a Prophet model to predict COVID-19 trends. An effective natural language processing pipeline was first built to extract time-series signals of specific articles from a news corpus. Bursts of these temporal signals were further identified with Kleinberg's burst detection algorithm. Across different US states, correlations of Google Trends data for COVID-19-related terms, COVID-19 news volume, and publicly available wastewater SARS-CoV-2 measurements with weekly COVID-19 case numbers were generally high, with lags ranging from 0 to 3 weeks, indicating that these signals are strong predictors of viral spread. Incorporating time-series signals of these effective predictors significantly improved the performance of the Prophet model, which was able to predict COVID-19 case numbers one and two weeks ahead with average mean absolute error rates of 0.38 and 0.46, respectively, across different states.
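The lagged-correlation step mentioned in the summary can be illustrated with a minimal sketch. This is not the authors' code: the function names (`pearson`, `lagged_correlations`, `best_lag`), the use of Pearson correlation, and the synthetic weekly series are all assumptions made for illustration. The sketch correlates a signal at week t with case counts at week t + lag, for lags of 0 to 3 weeks, and reports the lag with the highest correlation.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def lagged_correlations(signal, cases, max_lag=3):
    """Correlate signal[t] with cases[t + lag] for lag = 0..max_lag (in weeks).

    Both series are assumed to be weekly and aligned on the same start week.
    """
    if len(signal) != len(cases):
        raise ValueError("series must cover the same weeks")
    result = {}
    for lag in range(max_lag + 1):
        shifted_signal = signal[: len(signal) - lag]  # drop the last `lag` weeks
        later_cases = cases[lag:]                     # drop the first `lag` weeks
        result[lag] = pearson(shifted_signal, later_cases)
    return result


def best_lag(signal, cases, max_lag=3):
    """Return the lag (in weeks) at which the signal correlates best with cases."""
    correlations = lagged_correlations(signal, cases, max_lag)
    return max(correlations, key=correlations.get)


# Synthetic example (hypothetical data): cases repeat the signal two weeks later.
weekly_signal = [1, 3, 2, 5, 4, 7, 6, 9, 8, 10]
weekly_cases = [0, 0, 1, 3, 2, 5, 4, 7, 6, 9]
leading_weeks = best_lag(weekly_signal, weekly_cases)
```

In practice, a signal whose best lag is greater than zero leads the case counts and is a candidate for inclusion as an extra predictor in a forecasting model; how the authors selected and scaled their predictors is not stated in this record.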