Forecasting influenza epidemics by integrating internet search queries and traditional surveillance data with the support vector machine regression model in Liaoning, from 2011 to 2015


Bibliographic Details
Main Authors: Liang, Feng; Guan, Peng; Wu, Wei; Huang, Desheng
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2018
Subjects: Health Policy
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6022725/
https://www.ncbi.nlm.nih.gov/pubmed/29967755
http://dx.doi.org/10.7717/peerj.5134
collection PubMed
description BACKGROUND: Influenza epidemics pose significant social and economic challenges in China. Internet search query data have been identified as a valuable source for the detection of emerging influenza epidemics. However, the selection of the search queries and the adoption of prediction methods are crucial challenges when it comes to improving predictions. The purpose of this study was to explore the application of the Support Vector Machine (SVM) regression model in merging search engine query data and traditional influenza data. METHODS: The official monthly reported number of influenza cases in Liaoning province in China was acquired from the China National Scientific Data Center for Public Health from January 2011 to December 2015. Based on Baidu Index, a publicly available search engine database, search queries potentially related to influenza over the corresponding period were identified. An SVM regression model was built for prediction, and the three parameters (C, γ, ε) of the SVM regression model were chosen by leave-one-out cross-validation (LOOCV) during the model construction process. Model performance was evaluated using Root Mean Square Error, Root Mean Square Percentage Error and Mean Absolute Percentage Error. RESULTS: In total, 17 search queries related to influenza were generated through the initial query selection approach and were adopted to construct the SVM regression model, including nine queries in the same month, three queries at a lag of one month, one query at a lag of two months and four queries at a lag of three months. The SVM model performed well with the parameters (C = 2, γ = 0.005, ε = 0.0001), based on the ensemble data integrating the influenza surveillance data and Baidu search query data. CONCLUSIONS: The results demonstrated the feasibility of using internet search engine query data as a complementary data source for influenza surveillance and the efficiency of the SVM regression model in tracking the influenza epidemics in Liaoning.
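The building blocks named in the abstract (lagged monthly query series, leave-one-out splits, and the three error metrics) can be sketched in plain Python. This is an illustrative sketch only, not the authors' code: all function names are hypothetical, the metric formulas are the common textbook definitions (returned as fractions, not multiplied by 100) and may differ from the paper's exact conventions, and the SVM fitting step itself is left out.

```python
import math


def lagged(series, lag):
    """Shift a monthly series by `lag` months.

    Position t of the result holds the value from month t - lag; the first
    `lag` positions have no source month and are filled with None.
    """
    return [None] * lag + series[: len(series) - lag]


def loocv_splits(n):
    """Yield (train_indices, test_index) pairs for leave-one-out CV."""
    for i in range(n):
        yield [j for j in range(n) if j != i], i


def rmse(actual, predicted):
    """Root Mean Square Error (same units as the case counts)."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def rmspe(actual, predicted):
    """Root Mean Square Percentage Error, as a fraction.

    Assumes every actual value is non-zero.
    """
    n = len(actual)
    return math.sqrt(
        sum(((a - p) / a) ** 2 for a, p in zip(actual, predicted)) / n
    )


def mape(actual, predicted):
    """Mean Absolute Percentage Error, as a fraction.

    Assumes every actual value is non-zero.
    """
    n = len(actual)
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n
```

In a parameter search of the kind described, each candidate (C, γ, ε) would be scored by fitting on every training split from `loocv_splits` and comparing the held-out predictions with the reported case counts using one of these metrics.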
id pubmed-6022725
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-6022725 2018-07-02 PeerJ Health Policy PeerJ Inc. 2018-06-25 Text en © 2018 Liang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
topic Health Policy