An infodemiological framework for tracking the spread of SARS-CoV-2 using integrated public data
The outbreak of the SARS-CoV-2 novel coronavirus has caused a health crisis of immeasurable magnitude. Signals from heterogeneous public data sources could serve as early predictors for infection waves of the pandemic, particularly in its early phases, when infection data was scarce. In this article...
Main authors: | Liu, Zhimin; Jiang, Zuodong; Kip, Geoffrey; Snigdha, Kirti; Xu, Jennings; Wu, Xiaoying; Khan, Najat; Schultz, Timothy |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | The Authors. Published by Elsevier B.V. 2022 |
Subjects: | Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9040481/ https://www.ncbi.nlm.nih.gov/pubmed/35496673 http://dx.doi.org/10.1016/j.patrec.2022.04.030 |
_version_ | 1784694345698377728 |
---|---|
author | Liu, Zhimin Jiang, Zuodong Kip, Geoffrey Snigdha, Kirti Xu, Jennings Wu, Xiaoying Khan, Najat Schultz, Timothy |
author_sort | Liu, Zhimin |
collection | PubMed |
description | The outbreak of the SARS-CoV-2 novel coronavirus has caused a health crisis of immeasurable magnitude. Signals from heterogeneous public data sources could serve as early predictors of infection waves, particularly in the early phases of the pandemic, when infection data were scarce. In this article, we characterize temporal pandemic indicators by leveraging an integrated set of public data and apply them to a Prophet model to predict COVID-19 trends. A natural language processing pipeline was first built to extract time-series signals of specific articles from a news corpus. Bursts in these temporal signals were then identified with Kleinberg's burst detection algorithm. Across different US states, correlations of Google Trends for COVID-19-related terms, COVID-19 news volume, and publicly available wastewater SARS-CoV-2 measurements with weekly COVID-19 case numbers were generally high, with lags ranging from 0 to 3 weeks, indicating that these signals are strong predictors of viral spread. Incorporating the time-series signals of these predictors significantly improved the performance of the Prophet model, which predicted COVID-19 case numbers one and two weeks ahead with average mean absolute error rates of 0.38 and 0.46, respectively, across different states. |
format | Online Article Text |
id | pubmed-9040481 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | The Authors. Published by Elsevier B.V. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9040481 2022-04-26 Pattern Recognit Lett Article The Authors. Published by Elsevier B.V. 2022-06 2022-04-26 Text en © 2022 The Authors. Published by Elsevier B.V. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre, including this research content, immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database, with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active. |
title | An infodemiological framework for tracking the spread of SARS-CoV-2 using integrated public data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9040481/ https://www.ncbi.nlm.nih.gov/pubmed/35496673 http://dx.doi.org/10.1016/j.patrec.2022.04.030 |
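
The description above reports correlations between public signals (Google Trends, news volume, wastewater measurements) and weekly case numbers at lags of 0 to 3 weeks. The following is a minimal sketch of how such a lagged correlation could be computed. It is not the authors' code: the function names `pearson` and `lagged_correlations` are hypothetical, the use of Pearson correlation is an assumption (the record does not state which coefficient was used), and the two series are synthetic toy values, not data from the article.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def lagged_correlations(signal, cases, max_lag=3):
    """Correlate signal[t] with cases[t + lag] for lag = 0..max_lag (in weeks)."""
    result = {}
    for lag in range(max_lag + 1):
        # Drop the last `lag` signal weeks and the first `lag` case weeks
        # so that each signal value is paired with the cases `lag` weeks later.
        lead = signal[: len(signal) - lag]
        follow = cases[lag:]
        result[lag] = pearson(lead, follow)
    return result


# Synthetic toy series (NOT from the article): cases echo the signal two weeks later.
signal = [1, 3, 2, 6, 9, 4, 2, 5, 8, 3, 1, 2]
cases = [0, 0] + signal[:-2]

correlations = lagged_correlations(signal, cases)
best_lag = max(correlations, key=correlations.get)
print(best_lag, correlations)
```

In this toy setup the lag with the highest correlation recovers the two-week delay built into the synthetic series. With real weekly data, the same loop would be run per state and per signal.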