
Cumulative Query Method for Influenza Surveillance Using Search Engine Data

BACKGROUND: Internet search queries have become an important data source in syndromic surveillance systems. However, there is currently no syndromic surveillance system using Internet search query data in South Korea. OBJECTIVES: The objective of this study was to examine correlations between our cum...


Bibliographic Details
Main Authors: Seo, Dong-Woo, Jo, Min-Woo, Sohn, Chang Hwan, Shin, Soo-Yong, Lee, JaeHo, Yu, Maengsoo, Kim, Won Young, Lim, Kyoung Soo, Lee, Sang-Il
Format: Online Article Text
Language: English
Published: JMIR Publications Inc. 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4275481/
https://www.ncbi.nlm.nih.gov/pubmed/25517353
http://dx.doi.org/10.2196/jmir.3680
_version_ 1782350131976208384
author Seo, Dong-Woo
Jo, Min-Woo
Sohn, Chang Hwan
Shin, Soo-Yong
Lee, JaeHo
Yu, Maengsoo
Kim, Won Young
Lim, Kyoung Soo
Lee, Sang-Il
author_facet Seo, Dong-Woo
Jo, Min-Woo
Sohn, Chang Hwan
Shin, Soo-Yong
Lee, JaeHo
Yu, Maengsoo
Kim, Won Young
Lim, Kyoung Soo
Lee, Sang-Il
author_sort Seo, Dong-Woo
collection PubMed
description BACKGROUND: Internet search queries have become an important data source in syndromic surveillance systems. However, there is currently no syndromic surveillance system using Internet search query data in South Korea. OBJECTIVES: The objective of this study was to examine correlations between our cumulative query method and national influenza surveillance data. METHODS: Our study was based on the local search engine, Daum (approximately 25% market share), and influenza-like illness (ILI) data from the Korea Centers for Disease Control and Prevention. A quota sampling survey was conducted with 200 participants to obtain popular queries. We divided the study period into two sets: Set 1 (the 2009/10 epidemiological year for development set 1 and 2010/11 for validation set 1) and Set 2 (2010/11 for development set 2 and 2011/12 for validation set 2). Pearson’s correlation coefficients were calculated between the Daum data and the ILI data for the development set. We selected the combined queries for which the correlation coefficients were .7 or higher and listed them in descending order. Then, we created cumulative query methods, with n denoting the number of combined queries accumulated in descending order of the correlation coefficient. RESULTS: In validation set 1, 13 cumulative query methods were applied, and 8 had higher correlation coefficients (min=.916, max=.943) than that of the highest single combined query. Further, 11 of 13 cumulative query methods had an r value of ≥.7, whereas only 4 of 13 combined queries had an r value of ≥.7. In validation set 2, 8 of 15 cumulative query methods showed higher correlation coefficients (min=.975, max=.987) than that of the highest single combined query. All 15 cumulative query methods had an r value of ≥.7, whereas only 6 of 15 combined queries had an r value of ≥.7.
CONCLUSIONS: The cumulative query method showed relatively higher correlation with national influenza surveillance data than combined queries in the development and validation sets.
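The selection and accumulation steps described in METHODS can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: it assumes that "cumulative query method n" means the week-by-week sum of the search volumes of the top n combined queries, which the abstract does not spell out. The function names, the query names, and all numbers below are hypothetical toy values, not Daum or KCDC data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def rank_queries(dev_queries, dev_ili, threshold=0.7):
    """Keep queries with r >= threshold on the development set,
    listed in descending order of r."""
    scored = [(pearson(series, dev_ili), name)
              for name, series in dev_queries.items()]
    kept = sorted((pair for pair in scored if pair[0] >= threshold),
                  reverse=True)
    return [name for _, name in kept]


def cumulative_series(queries, ranked, n):
    """ASSUMPTION: method n is the weekly sum of the top n ranked queries."""
    top = [queries[name] for name in ranked[:n]]
    return [sum(week) for week in zip(*top)]


def evaluate(val_queries, val_ili, ranked):
    """Correlation of each cumulative query method n with validation ILI."""
    return {n: pearson(cumulative_series(val_queries, ranked, n), val_ili)
            for n in range(1, len(ranked) + 1)}


# Hypothetical weekly values for one development and one validation season.
dev_ili = [1, 2, 4, 8, 5, 3]
dev_queries = {
    "flu": [10, 20, 45, 80, 50, 30],
    "flu symptoms": [5, 9, 22, 35, 30, 12],
    "weather": [7, 7, 6, 7, 8, 7],
}
val_ili = [2, 3, 6, 9, 4, 2]
val_queries = {
    "flu": [18, 33, 58, 95, 41, 25],
    "flu symptoms": [8, 15, 30, 40, 26, 11],
    "weather": [6, 8, 7, 7, 6, 8],
}

ranked = rank_queries(dev_queries, dev_ili)
results = evaluate(val_queries, val_ili, ranked)
for n, r in results.items():
    print(f"cumulative query method {n}: r = {r:.3f}")
```

Ranking is done only on the development season, and the resulting order is then applied unchanged to the validation season, mirroring the development/validation split described above.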
format Online
Article
Text
id pubmed-4275481
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher JMIR Publications Inc.
record_format MEDLINE/PubMed
spelling pubmed-42754812014-12-26 Cumulative Query Method for Influenza Surveillance Using Search Engine Data Seo, Dong-Woo Jo, Min-Woo Sohn, Chang Hwan Shin, Soo-Yong Lee, JaeHo Yu, Maengsoo Kim, Won Young Lim, Kyoung Soo Lee, Sang-Il J Med Internet Res Original Paper BACKGROUND: Internet search queries have become an important data source in syndromic surveillance systems. However, there is currently no syndromic surveillance system using Internet search query data in South Korea. OBJECTIVES: The objective of this study was to examine correlations between our cumulative query method and national influenza surveillance data. METHODS: Our study was based on the local search engine, Daum (approximately 25% market share), and influenza-like illness (ILI) data from the Korea Centers for Disease Control and Prevention. A quota sampling survey was conducted with 200 participants to obtain popular queries. We divided the study period into two sets: Set 1 (the 2009/10 epidemiological year for development set 1 and 2010/11 for validation set 1) and Set 2 (2010/11 for development set 2 and 2011/12 for validation set 2). Pearson’s correlation coefficients were calculated between the Daum data and the ILI data for the development set. We selected the combined queries for which the correlation coefficients were .7 or higher and listed them in descending order. Then, we created cumulative query methods, with n denoting the number of combined queries accumulated in descending order of the correlation coefficient. RESULTS: In validation set 1, 13 cumulative query methods were applied, and 8 had higher correlation coefficients (min=.916, max=.943) than that of the highest single combined query. Further, 11 of 13 cumulative query methods had an r value of ≥.7, whereas only 4 of 13 combined queries had an r value of ≥.7. In validation set 2, 8 of 15 cumulative query methods showed higher correlation coefficients (min=.975, max=.987) than that of the highest single combined query.
All 15 cumulative query methods had an r value of ≥.7, whereas only 6 of 15 combined queries had an r value of ≥.7. CONCLUSIONS: The cumulative query method showed relatively higher correlation with national influenza surveillance data than combined queries in the development and validation sets. JMIR Publications Inc. 2014-12-16 /pmc/articles/PMC4275481/ /pubmed/25517353 http://dx.doi.org/10.2196/jmir.3680 Text en ©Dong-Woo Seo, Min-Woo Jo, Chang Hwan Sohn, Soo-Yong Shin, JaeHo Lee, Maengsoo Yu, Won Young Kim, Kyoung Soo Lim, Sang-Il Lee. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 16.12.2014. http://creativecommons.org/licenses/by/2.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
spellingShingle Original Paper
Seo, Dong-Woo
Jo, Min-Woo
Sohn, Chang Hwan
Shin, Soo-Yong
Lee, JaeHo
Yu, Maengsoo
Kim, Won Young
Lim, Kyoung Soo
Lee, Sang-Il
Cumulative Query Method for Influenza Surveillance Using Search Engine Data
title Cumulative Query Method for Influenza Surveillance Using Search Engine Data
title_full Cumulative Query Method for Influenza Surveillance Using Search Engine Data
title_fullStr Cumulative Query Method for Influenza Surveillance Using Search Engine Data
title_full_unstemmed Cumulative Query Method for Influenza Surveillance Using Search Engine Data
title_short Cumulative Query Method for Influenza Surveillance Using Search Engine Data
title_sort cumulative query method for influenza surveillance using search engine data
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4275481/
https://www.ncbi.nlm.nih.gov/pubmed/25517353
http://dx.doi.org/10.2196/jmir.3680
work_keys_str_mv AT seodongwoo cumulativequerymethodforinfluenzasurveillanceusingsearchenginedata
AT jominwoo cumulativequerymethodforinfluenzasurveillanceusingsearchenginedata
AT sohnchanghwan cumulativequerymethodforinfluenzasurveillanceusingsearchenginedata
AT shinsooyong cumulativequerymethodforinfluenzasurveillanceusingsearchenginedata
AT leejaeho cumulativequerymethodforinfluenzasurveillanceusingsearchenginedata
AT yumaengsoo cumulativequerymethodforinfluenzasurveillanceusingsearchenginedata
AT kimwonyoung cumulativequerymethodforinfluenzasurveillanceusingsearchenginedata
AT limkyoungsoo cumulativequerymethodforinfluenzasurveillanceusingsearchenginedata
AT leesangil cumulativequerymethodforinfluenzasurveillanceusingsearchenginedata