Regional Influenza Prediction with Sampling Twitter Data and PDE Model

The large volume of geotagged Twitter streaming data on flu epidemics provides chances for researchers to explore, model, and predict the trends of flu cases in a timely manner. However, the explosive growth of data from social media makes data sampling a natural choice. In this paper, we develop a...

Bibliographic Details
Main Authors: Wang, Yufang; Xu, Kuai; Kang, Yun; Wang, Haiyan; Wang, Feng; Avram, Adrian
Format: Online Article Text
Language: English
Published: MDPI 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7037800/
https://www.ncbi.nlm.nih.gov/pubmed/31973008
http://dx.doi.org/10.3390/ijerph17030678
_version_ 1783500507266940928
author Wang, Yufang
Xu, Kuai
Kang, Yun
Wang, Haiyan
Wang, Feng
Avram, Adrian
author_facet Wang, Yufang
Xu, Kuai
Kang, Yun
Wang, Haiyan
Wang, Feng
Avram, Adrian
author_sort Wang, Yufang
collection PubMed
description The large volume of geotagged Twitter streaming data on flu epidemics provides chances for researchers to explore, model, and predict the trends of flu cases in a timely manner. However, the explosive growth of data from social media makes data sampling a natural choice. In this paper, we develop a method for influenza prediction based on the real-time tweet data from social media, and this method ensures real-time prediction and is applicable to sampling data. Specifically, we first simulate the sampling process of flu tweets, and then develop a specific partial differential equation (PDE) model to characterize and predict the aggregated flu tweet volumes. Our PDE model incorporates the effects of flu spreading, flu recovery, and active human interventions for reducing flu. Our extensive simulation results show that this PDE model can almost eliminate the data reduction effects from the sampling process: It requires lesser historical data but achieves stronger prediction results with a relative accuracy of over 90% on the 1% sampling data. Even for the more aggressive data sampling ratios such as 0.1% and 0.01% sampling, our model is still able to achieve relative accuracies of 85% and 83%, respectively. These promising results highlight the ability of our mechanistic PDE model in predicting temporal–spatial patterns of flu trends even in the scenario of small sampling Twitter data.
format Online
Article
Text
id pubmed-7037800
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-70378002020-03-10 Regional Influenza Prediction with Sampling Twitter Data and PDE Model Wang, Yufang Xu, Kuai Kang, Yun Wang, Haiyan Wang, Feng Avram, Adrian Int J Environ Res Public Health Article The large volume of geotagged Twitter streaming data on flu epidemics provides chances for researchers to explore, model, and predict the trends of flu cases in a timely manner. However, the explosive growth of data from social media makes data sampling a natural choice. In this paper, we develop a method for influenza prediction based on the real-time tweet data from social media, and this method ensures real-time prediction and is applicable to sampling data. Specifically, we first simulate the sampling process of flu tweets, and then develop a specific partial differential equation (PDE) model to characterize and predict the aggregated flu tweet volumes. Our PDE model incorporates the effects of flu spreading, flu recovery, and active human interventions for reducing flu. Our extensive simulation results show that this PDE model can almost eliminate the data reduction effects from the sampling process: It requires lesser historical data but achieves stronger prediction results with a relative accuracy of over 90% on the 1% sampling data. Even for the more aggressive data sampling ratios such as 0.1% and 0.01% sampling, our model is still able to achieve relative accuracies of 85% and 83%, respectively. These promising results highlight the ability of our mechanistic PDE model in predicting temporal–spatial patterns of flu trends even in the scenario of small sampling Twitter data. MDPI 2020-01-21 2020-02 /pmc/articles/PMC7037800/ /pubmed/31973008 http://dx.doi.org/10.3390/ijerph17030678 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Wang, Yufang
Xu, Kuai
Kang, Yun
Wang, Haiyan
Wang, Feng
Avram, Adrian
Regional Influenza Prediction with Sampling Twitter Data and PDE Model
title Regional Influenza Prediction with Sampling Twitter Data and PDE Model
title_full Regional Influenza Prediction with Sampling Twitter Data and PDE Model
title_fullStr Regional Influenza Prediction with Sampling Twitter Data and PDE Model
title_full_unstemmed Regional Influenza Prediction with Sampling Twitter Data and PDE Model
title_short Regional Influenza Prediction with Sampling Twitter Data and PDE Model
title_sort regional influenza prediction with sampling twitter data and pde model
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7037800/
https://www.ncbi.nlm.nih.gov/pubmed/31973008
http://dx.doi.org/10.3390/ijerph17030678
work_keys_str_mv AT wangyufang regionalinfluenzapredictionwithsamplingtwitterdataandpdemodel
AT xukuai regionalinfluenzapredictionwithsamplingtwitterdataandpdemodel
AT kangyun regionalinfluenzapredictionwithsamplingtwitterdataandpdemodel
AT wanghaiyan regionalinfluenzapredictionwithsamplingtwitterdataandpdemodel
AT wangfeng regionalinfluenzapredictionwithsamplingtwitterdataandpdemodel
AT avramadrian regionalinfluenzapredictionwithsamplingtwitterdataandpdemodel