Regional Influenza Prediction with Sampling Twitter Data and PDE Model
The large volume of geotagged Twitter streaming data on flu epidemics provides chances for researchers to explore, model, and predict the trends of flu cases in a timely manner. However, the explosive growth of data from social media makes data sampling a natural choice. In this paper, we develop a...
Main Authors: | Wang, Yufang; Xu, Kuai; Kang, Yun; Wang, Haiyan; Wang, Feng; Avram, Adrian |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7037800/ https://www.ncbi.nlm.nih.gov/pubmed/31973008 http://dx.doi.org/10.3390/ijerph17030678 |
_version_ | 1783500507266940928 |
---|---|
author | Wang, Yufang Xu, Kuai Kang, Yun Wang, Haiyan Wang, Feng Avram, Adrian |
author_facet | Wang, Yufang Xu, Kuai Kang, Yun Wang, Haiyan Wang, Feng Avram, Adrian |
author_sort | Wang, Yufang |
collection | PubMed |
description | The large volume of geotagged Twitter streaming data on flu epidemics provides chances for researchers to explore, model, and predict the trends of flu cases in a timely manner. However, the explosive growth of data from social media makes data sampling a natural choice. In this paper, we develop a method for influenza prediction based on the real-time tweet data from social media, and this method ensures real-time prediction and is applicable to sampling data. Specifically, we first simulate the sampling process of flu tweets, and then develop a specific partial differential equation (PDE) model to characterize and predict the aggregated flu tweet volumes. Our PDE model incorporates the effects of flu spreading, flu recovery, and active human interventions for reducing flu. Our extensive simulation results show that this PDE model can almost eliminate the data reduction effects from the sampling process: It requires lesser historical data but achieves stronger prediction results with a relative accuracy of over 90% on the 1% sampling data. Even for the more aggressive data sampling ratios such as 0.1% and 0.01% sampling, our model is still able to achieve relative accuracies of 85% and 83%, respectively. These promising results highlight the ability of our mechanistic PDE model in predicting temporal–spatial patterns of flu trends even in the scenario of small sampling Twitter data. |
format | Online Article Text |
id | pubmed-7037800 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-70378002020-03-10 Regional Influenza Prediction with Sampling Twitter Data and PDE Model Wang, Yufang Xu, Kuai Kang, Yun Wang, Haiyan Wang, Feng Avram, Adrian Int J Environ Res Public Health Article The large volume of geotagged Twitter streaming data on flu epidemics provides chances for researchers to explore, model, and predict the trends of flu cases in a timely manner. However, the explosive growth of data from social media makes data sampling a natural choice. In this paper, we develop a method for influenza prediction based on the real-time tweet data from social media, and this method ensures real-time prediction and is applicable to sampling data. Specifically, we first simulate the sampling process of flu tweets, and then develop a specific partial differential equation (PDE) model to characterize and predict the aggregated flu tweet volumes. Our PDE model incorporates the effects of flu spreading, flu recovery, and active human interventions for reducing flu. Our extensive simulation results show that this PDE model can almost eliminate the data reduction effects from the sampling process: It requires lesser historical data but achieves stronger prediction results with a relative accuracy of over 90% on the 1% sampling data. Even for the more aggressive data sampling ratios such as 0.1% and 0.01% sampling, our model is still able to achieve relative accuracies of 85% and 83%, respectively. These promising results highlight the ability of our mechanistic PDE model in predicting temporal–spatial patterns of flu trends even in the scenario of small sampling Twitter data. MDPI 2020-01-21 2020-02 /pmc/articles/PMC7037800/ /pubmed/31973008 http://dx.doi.org/10.3390/ijerph17030678 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Wang, Yufang Xu, Kuai Kang, Yun Wang, Haiyan Wang, Feng Avram, Adrian Regional Influenza Prediction with Sampling Twitter Data and PDE Model |
title | Regional Influenza Prediction with Sampling Twitter Data and PDE Model |
title_full | Regional Influenza Prediction with Sampling Twitter Data and PDE Model |
title_fullStr | Regional Influenza Prediction with Sampling Twitter Data and PDE Model |
title_full_unstemmed | Regional Influenza Prediction with Sampling Twitter Data and PDE Model |
title_short | Regional Influenza Prediction with Sampling Twitter Data and PDE Model |
title_sort | regional influenza prediction with sampling twitter data and pde model |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7037800/ https://www.ncbi.nlm.nih.gov/pubmed/31973008 http://dx.doi.org/10.3390/ijerph17030678 |
work_keys_str_mv | AT wangyufang regionalinfluenzapredictionwithsamplingtwitterdataandpdemodel AT xukuai regionalinfluenzapredictionwithsamplingtwitterdataandpdemodel AT kangyun regionalinfluenzapredictionwithsamplingtwitterdataandpdemodel AT wanghaiyan regionalinfluenzapredictionwithsamplingtwitterdataandpdemodel AT wangfeng regionalinfluenzapredictionwithsamplingtwitterdataandpdemodel AT avramadrian regionalinfluenzapredictionwithsamplingtwitterdataandpdemodel |