
Improved Real-Time Influenza Surveillance: Using Internet Search Data in Eight Latin American Countries

BACKGROUND: Novel influenza surveillance systems that leverage Internet-based real-time data sources including Internet search frequencies, social-network information, and crowd-sourced flu surveillance tools have shown improved accuracy over the past few years in data-rich countries like the United...


Bibliographic Details
Main authors: Clemente, Leonardo; Lu, Fred; Santillana, Mauricio
Format: Online Article Text
Language: English
Published: JMIR Publications 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6470460/
https://www.ncbi.nlm.nih.gov/pubmed/30946017
http://dx.doi.org/10.2196/12214
_version_ 1783411802141360128
author Clemente, Leonardo
Lu, Fred
Santillana, Mauricio
author_facet Clemente, Leonardo
Lu, Fred
Santillana, Mauricio
author_sort Clemente, Leonardo
collection PubMed
description BACKGROUND: Novel influenza surveillance systems that leverage Internet-based real-time data sources, including Internet search frequencies, social-network information, and crowd-sourced flu surveillance tools, have shown improved accuracy over the past few years in data-rich countries like the United States. These systems not only track flu activity accurately, but they also report flu estimates a week or more ahead of the publication of reports produced by healthcare-based systems, such as those implemented and managed by the Centers for Disease Control and Prevention. Previous work has shown that novel flu surveillance systems, like Google Flu Trends (GFT), have not yet delivered acceptable flu estimates in developing countries in Latin America. OBJECTIVE: The aim of this study was to show that recent methodological improvements in the use of Internet search engine information to track diseases can lead to improved retrospective flu estimates in multiple countries in Latin America. METHODS: A machine learning-based methodology that uses flu-related Internet search activity and historical information to monitor flu activity, named ARGO (AutoRegression with Google search), was extended to generate flu predictions for 8 Latin American countries (Argentina, Bolivia, Brazil, Chile, Mexico, Paraguay, Peru, and Uruguay) for the period from January 2012 to December 2016. These retrospective (out-of-sample) influenza activity predictions were compared with historically observed suspected flu cases in each country, as reported by FluNet, an influenza surveillance database maintained by the World Health Organization. For a baseline comparison, retrospective (out-of-sample) flu estimates were produced for the same time period using autoregressive models that only leverage historical flu activity information.
RESULTS: Our results show that ARGO-like models outperform autoregressive models in predictive power in 6 out of 8 countries in the 2012-2016 time period. Moreover, ARGO significantly improves on historical flu estimates produced by the now-discontinued GFT for 2012-2015, the period for which GFT information is publicly available. CONCLUSIONS: We demonstrate here that a self-correcting machine learning method, leveraging Internet-based disease-related search activity and historical flu trends, has the potential to produce reliable and timely flu estimates in multiple Latin American countries. This methodology may prove helpful to local public health officials who design and implement interventions aimed at mitigating the effects of influenza outbreaks. Our methodology generally outperforms both the now-discontinued tool GFT and autoregressive methodologies that exploit only historical flu activity to produce future disease estimates.
format Online
Article
Text
id pubmed-6470460
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-6470460 2019-05-08 Improved Real-Time Influenza Surveillance: Using Internet Search Data in Eight Latin American Countries Clemente, Leonardo Lu, Fred Santillana, Mauricio JMIR Public Health Surveill Original Paper JMIR Publications 2019-04-04 /pmc/articles/PMC6470460/ /pubmed/30946017 http://dx.doi.org/10.2196/12214 Text en ©Leonardo Clemente, Fred Lu, Mauricio Santillana. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 04.04.2019. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited. The complete bibliographic information, a link to the original publication on http://publichealth.jmir.org, as well as this copyright and license information must be included.
spellingShingle Original Paper
Clemente, Leonardo
Lu, Fred
Santillana, Mauricio
Improved Real-Time Influenza Surveillance: Using Internet Search Data in Eight Latin American Countries
title Improved Real-Time Influenza Surveillance: Using Internet Search Data in Eight Latin American Countries
title_full Improved Real-Time Influenza Surveillance: Using Internet Search Data in Eight Latin American Countries
title_fullStr Improved Real-Time Influenza Surveillance: Using Internet Search Data in Eight Latin American Countries
title_full_unstemmed Improved Real-Time Influenza Surveillance: Using Internet Search Data in Eight Latin American Countries
title_short Improved Real-Time Influenza Surveillance: Using Internet Search Data in Eight Latin American Countries
title_sort improved real-time influenza surveillance: using internet search data in eight latin american countries
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6470460/
https://www.ncbi.nlm.nih.gov/pubmed/30946017
http://dx.doi.org/10.2196/12214
work_keys_str_mv AT clementeleonardo improvedrealtimeinfluenzasurveillanceusinginternetsearchdataineightlatinamericancountries
AT lufred improvedrealtimeinfluenzasurveillanceusinginternetsearchdataineightlatinamericancountries
AT santillanamauricio improvedrealtimeinfluenzasurveillanceusinginternetsearchdataineightlatinamericancountries