Estimating Influenza Outbreaks Using Both Search Engine Query Data and Social Media Data in South Korea
BACKGROUND: As suggested as early as in 2006, logs of queries submitted to search engines seeking information could be a source for detection of emerging influenza epidemics if changes in the volume of search queries are monitored (infodemiology). However, selecting queries that are most likely to b...
Main authors: | Woo, Hyekyung; Cho, Youngtae; Shim, Eunyoung; Lee, Jong-Koo; Lee, Chang-Gun; Kim, Seong Hwan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | JMIR Publications 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4949385/ https://www.ncbi.nlm.nih.gov/pubmed/27377323 http://dx.doi.org/10.2196/jmir.4955 |
_version_ | 1782443419432386560 |
---|---|
author | Woo, Hyekyung Cho, Youngtae Shim, Eunyoung Lee, Jong-Koo Lee, Chang-Gun Kim, Seong Hwan |
author_facet | Woo, Hyekyung Cho, Youngtae Shim, Eunyoung Lee, Jong-Koo Lee, Chang-Gun Kim, Seong Hwan |
author_sort | Woo, Hyekyung |
collection | PubMed |
description | BACKGROUND: As suggested as early as in 2006, logs of queries submitted to search engines seeking information could be a source for detection of emerging influenza epidemics if changes in the volume of search queries are monitored (infodemiology). However, selecting queries that are most likely to be associated with influenza epidemics is a particular challenge when it comes to generating better predictions. OBJECTIVE: In this study, we describe a methodological extension for detecting influenza outbreaks using search query data; we provide a new approach for query selection through the exploration of contextual information gleaned from social media data. Additionally, we evaluate whether it is possible to use these queries for monitoring and predicting influenza epidemics in South Korea. METHODS: Our study was based on freely available weekly influenza incidence data and query data originating from the search engine on the Korean website Daum between April 3, 2011 and April 5, 2014. To select queries related to influenza epidemics, several approaches were applied: (1) exploring influenza-related words in social media data, (2) identifying the chief concerns related to influenza, and (3) using Web query recommendations. Optimal feature selection by least absolute shrinkage and selection operator (Lasso) and support vector machine for regression (SVR) were used to construct a model predicting influenza epidemics. RESULTS: In total, 146 queries related to influenza were generated through our initial query selection approach. A considerable proportion of optimal features for final models were derived from queries with reference to the social media data. The SVR model performed well: the prediction values were highly correlated with the recent observed influenza-like illness (r=.956; P<.001) and virological incidence rate (r=.963; P<.001). CONCLUSIONS: These results demonstrate the feasibility of using search queries to enhance influenza surveillance in South Korea. In addition, an approach for query selection using social media data seems ideal for supporting influenza surveillance based on search query data. |
format | Online Article Text |
id | pubmed-4949385 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | JMIR Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-4949385 2016-08-03 Estimating Influenza Outbreaks Using Both Search Engine Query Data and Social Media Data in South Korea Woo, Hyekyung Cho, Youngtae Shim, Eunyoung Lee, Jong-Koo Lee, Chang-Gun Kim, Seong Hwan J Med Internet Res Original Paper BACKGROUND: As suggested as early as in 2006, logs of queries submitted to search engines seeking information could be a source for detection of emerging influenza epidemics if changes in the volume of search queries are monitored (infodemiology). However, selecting queries that are most likely to be associated with influenza epidemics is a particular challenge when it comes to generating better predictions. OBJECTIVE: In this study, we describe a methodological extension for detecting influenza outbreaks using search query data; we provide a new approach for query selection through the exploration of contextual information gleaned from social media data. Additionally, we evaluate whether it is possible to use these queries for monitoring and predicting influenza epidemics in South Korea. METHODS: Our study was based on freely available weekly influenza incidence data and query data originating from the search engine on the Korean website Daum between April 3, 2011 and April 5, 2014. To select queries related to influenza epidemics, several approaches were applied: (1) exploring influenza-related words in social media data, (2) identifying the chief concerns related to influenza, and (3) using Web query recommendations. Optimal feature selection by least absolute shrinkage and selection operator (Lasso) and support vector machine for regression (SVR) were used to construct a model predicting influenza epidemics. RESULTS: In total, 146 queries related to influenza were generated through our initial query selection approach. A considerable proportion of optimal features for final models were derived from queries with reference to the social media data. The SVR model performed well: the prediction values were highly correlated with the recent observed influenza-like illness (r=.956; P<.001) and virological incidence rate (r=.963; P<.001). CONCLUSIONS: These results demonstrate the feasibility of using search queries to enhance influenza surveillance in South Korea. In addition, an approach for query selection using social media data seems ideal for supporting influenza surveillance based on search query data. JMIR Publications 2016-07-04 /pmc/articles/PMC4949385/ /pubmed/27377323 http://dx.doi.org/10.2196/jmir.4955 Text en ©Hyekyung Woo, Youngtae Cho, Eunyoung Shim, Jong-Koo Lee, Chang-Gun Lee, Seong Hwan Kim. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 04.07.2016. https://creativecommons.org/licenses/by/2.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included. |
spellingShingle | Original Paper Woo, Hyekyung Cho, Youngtae Shim, Eunyoung Lee, Jong-Koo Lee, Chang-Gun Kim, Seong Hwan Estimating Influenza Outbreaks Using Both Search Engine Query Data and Social Media Data in South Korea |
title | Estimating Influenza Outbreaks Using Both Search Engine Query Data and Social Media Data in South Korea |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4949385/ https://www.ncbi.nlm.nih.gov/pubmed/27377323 http://dx.doi.org/10.2196/jmir.4955 |
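The description above reports how well the model's predictions tracked observed influenza-like illness using a correlation coefficient r. As a minimal sketch of how such a figure can be computed, the Python below implements the Pearson correlation from its definition. The function name `pearson_r` and both weekly series are hypothetical, invented for illustration only; they are not the study's data, and the record does not state which software the authors used.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical weekly values (NOT from the study), for illustration only.
observed_ili = [4.1, 5.0, 7.9, 12.3, 18.6, 15.2, 9.8, 6.0]
predicted_ili = [4.5, 5.4, 7.1, 11.8, 17.9, 16.0, 10.5, 6.4]

r = pearson_r(observed_ili, predicted_ili)
print(f"r = {r:.3f}")
```

A value of r close to 1 means the predicted series rises and falls with the observed one; it does not by itself show that the predicted magnitudes are accurate, so an error measure would normally be reported alongside it.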