Use of Twitter data to improve Zika virus surveillance in the United States during the 2016 epidemic

BACKGROUND: Zika virus (ZIKV) is an emerging mosquito-borne arbovirus that can produce serious public health consequences. In 2016, ZIKV caused an epidemic in many countries around the world, including the United States. ZIKV surveillance and vector control are essential to combating future epidemics...

Bibliographic Details
Main authors: Masri, Shahir, Jia, Jianfeng, Li, Chen, Zhou, Guofa, Lee, Ming-Chieh, Yan, Guiyun, Wu, Jun
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6570872/
https://www.ncbi.nlm.nih.gov/pubmed/31200692
http://dx.doi.org/10.1186/s12889-019-7103-8
_version_ 1783427315640827904
author Masri, Shahir
Jia, Jianfeng
Li, Chen
Zhou, Guofa
Lee, Ming-Chieh
Yan, Guiyun
Wu, Jun
collection PubMed
description BACKGROUND: Zika virus (ZIKV) is an emerging mosquito-borne arbovirus that can produce serious public health consequences. In 2016, ZIKV caused an epidemic in many countries around the world, including the United States. ZIKV surveillance and vector control are essential to combating future epidemics. However, challenges relating to the timely publication of case reports significantly limit the effectiveness of current surveillance methods. In many countries with poor infrastructure, established systems for case reporting often do not exist. Previous studies investigating the H1N1 pandemic, general influenza and the recent Ebola outbreak have demonstrated that time- and geo-tagged Twitter data, which are immediately available, can be utilized to overcome these limitations. METHODS: In this study, we employed a recently developed system called Cloudberry to filter a random sample of Twitter data to investigate the feasibility of using such data for ZIKV epidemic tracking at the national and state (Florida) levels. Two auto-regressive models were calibrated using weekly ZIKV case counts and zika tweets in order to estimate weekly ZIKV cases 1 week in advance. RESULTS: While the models tended to over-predict at low case counts and under-predict at extremely high counts, a comparison of predicted versus observed weekly ZIKV case counts following model calibration demonstrated overall reasonable predictive accuracy, with an R² of 0.74 for the Florida model and 0.70 for the U.S. model. Time-series analysis of predicted and observed ZIKV cases following internal cross-validation exhibited very similar patterns, demonstrating reasonable model performance. Spatially, the distribution of cumulative ZIKV case counts (local and travel-related) and zika tweets across all 50 U.S. states showed a high correlation (r = 0.73) after adjusting for population. CONCLUSIONS: This study demonstrates the value of utilizing Twitter data for the purposes of disease surveillance. This is of high value to epidemiologists and public health officials charged with protecting the public during future outbreaks. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12889-019-7103-8) contains supplementary material, which is available to authorized users.
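The abstract describes auto-regressive models that use weekly case counts and tweet counts to estimate cases one week ahead, but it does not give the model specification. The following is a minimal sketch of one plausible form, assuming a single lag of each series and an ordinary least-squares fit; the function names, the lag structure, and the numbers are illustrative assumptions and are not taken from the study.

```python
# Illustrative sketch only: the model form and every number below are
# assumptions, not the specification or data used in the study.

def solve(a, b):
    """Solve the square system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_lagged_model(cases, tweets):
    """Least-squares fit of cases[t] = b0 + b1*cases[t-1] + b2*tweets[t-1]."""
    rows = [[1.0, cases[t - 1], tweets[t - 1]] for t in range(1, len(cases))]
    targets = cases[1:]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(3)]
    return solve(xtx, xty)


def predict_next(beta, cases_now, tweets_now):
    """Estimate next week's cases from this week's cases and tweets."""
    return beta[0] + beta[1] * cases_now + beta[2] * tweets_now


def r_squared(observed, predicted):
    """Coefficient of determination between observed and predicted values."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Synthetic weekly series, made up purely for illustration.
tweets = [5, 9, 14, 30, 42, 38, 25, 18, 12, 7]
cases = [2, 4, 7, 11, 22, 31, 29, 20, 14, 9]

beta = fit_lagged_model(cases, tweets)
fitted = [predict_next(beta, cases[t - 1], tweets[t - 1])
          for t in range(1, len(cases))]
print("coefficients:", [round(b, 3) for b in beta])
print("in-sample R^2:", round(r_squared(cases[1:], fitted), 3))
print("next-week estimate:", round(predict_next(beta, cases[-1], tweets[-1]), 1))
```

The in-sample R² printed here is computed on the same weeks used for fitting, so it is not comparable to the cross-validated results reported in the abstract; a held-out evaluation would refit the coefficients on a subset of weeks and score the remainder.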
format Online
Article
Text
id pubmed-6570872
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6570872 2019-06-27 Use of Twitter data to improve Zika virus surveillance in the United States during the 2016 epidemic. Masri, Shahir; Jia, Jianfeng; Li, Chen; Zhou, Guofa; Lee, Ming-Chieh; Yan, Guiyun; Wu, Jun. BMC Public Health. Research Article. BioMed Central 2019-06-14 /pmc/articles/PMC6570872/ /pubmed/31200692 http://dx.doi.org/10.1186/s12889-019-7103-8 Text en © The Author(s). 2019 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Use of Twitter data to improve Zika virus surveillance in the United States during the 2016 epidemic
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6570872/
https://www.ncbi.nlm.nih.gov/pubmed/31200692
http://dx.doi.org/10.1186/s12889-019-7103-8