Using electronic health records and Internet search information for accurate influenza forecasting

BACKGROUND: Accurate influenza activity forecasting helps public health officials prepare and allocate resources for unusual influenza activity. Traditional flu surveillance systems, such as the Centers for Disease Control and Prevention’s (CDC) influenza-like illnesses reports, lag behind real-time...

Bibliographic Details
Main Authors: Yang, Shihao, Santillana, Mauricio, Brownstein, John S., Gray, Josh, Richardson, Stewart, Kou, S. C.
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5423019/
https://www.ncbi.nlm.nih.gov/pubmed/28482810
http://dx.doi.org/10.1186/s12879-017-2424-7
_version_ 1783234884851990528
author Yang, Shihao
Santillana, Mauricio
Brownstein, John S.
Gray, Josh
Richardson, Stewart
Kou, S. C.
collection PubMed
description BACKGROUND: Accurate influenza activity forecasting helps public health officials prepare and allocate resources for unusual influenza activity. Traditional flu surveillance systems, such as the Centers for Disease Control and Prevention’s (CDC) influenza-like illness reports, lag behind real time by one to two weeks, whereas information contained in cloud-based electronic health records (EHR) and in Internet users’ search activity is typically available in near real time. We present a method that combines the information from these two data sources with historical flu activity to produce national flu forecasts for the United States up to four weeks ahead of the publication of CDC’s flu reports. METHODS: We extend a method originally designed to track flu using Google searches, named ARGO, to combine information from EHR and Internet searches with historical flu activity. Our regularized multivariate regression model dynamically selects the most appropriate variables for flu prediction every week. The model is assessed for the flu seasons within the time period 2013–2016 using multiple metrics, including root mean squared error (RMSE). RESULTS: Our method reduces the RMSE of the publicly available alternative method (Healthmap flutrends) by 33, 20, 17 and 21% for the four time horizons: real time, one, two, and three weeks ahead, respectively. These accuracy improvements are statistically significant at the 5% level. Our real-time estimates correctly identified the peak timing and magnitude of the studied flu seasons. CONCLUSIONS: Our method significantly reduces the prediction error when compared to historical publicly available Internet-based prediction systems, demonstrating that: (1) the method used to combine data sources is as important as data quality; (2) effectively extracting information from a cloud-based EHR and Internet search activity leads to accurate flu forecasts.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12879-017-2424-7) contains supplementary material, which is available to authorized users.
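The abstract describes a regularized regression that is refit every week and scored by RMSE against a baseline. The following is a minimal sketch of that general idea, not the authors' ARGO implementation: it assumes an L1 (lasso) penalty fitted by coordinate descent, a fixed-length rolling training window, and a simple persistence baseline in place of Healthmap flutrends. All function names, the penalty value, the window length, and the synthetic data are hypothetical choices made for illustration.

```python
import math
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_fit(rows, targets, penalty, n_iter=100):
    """Coordinate descent for (1/2n)*sum((y - b0 - x.b)^2) + penalty*sum(|b|)."""
    n, p = len(rows), len(rows[0])
    coef = [0.0] * p
    intercept = sum(targets) / n
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for x, y in zip(rows, targets):
                pred_without_j = intercept + sum(
                    coef[k] * x[k] for k in range(p) if k != j
                )
                rho += x[j] * (y - pred_without_j)
                norm += x[j] * x[j]
            coef[j] = soft_threshold(rho / n, penalty) / (norm / n) if norm > 0 else 0.0
        # The intercept is unpenalized: set it to the mean residual.
        intercept = sum(
            y - sum(c * v for c, v in zip(coef, x)) for x, y in zip(rows, targets)
        ) / n
    return intercept, coef


def rolling_forecasts(features, targets, window, penalty):
    """Refit on the previous `window` weeks, then predict week t, for each t >= window."""
    preds = []
    for t in range(window, len(targets)):
        b0, b = lasso_fit(features[t - window:t], targets[t - window:t], penalty)
        preds.append(b0 + sum(c * v for c, v in zip(b, features[t])))
    return preds


def rmse(preds, truth):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(preds, truth)) / len(truth))


def rmse_reduction_pct(model_rmse, baseline_rmse):
    """Percentage by which the model's RMSE is lower than the baseline's."""
    return 100.0 * (1.0 - model_rmse / baseline_rmse)


def demo():
    # Synthetic stand-ins (assumptions): a seasonal "flu activity" series, plus
    # noisy search and EHR signals that track the current week.
    rng = random.Random(0)
    weeks = 60
    ili = [2.0 + math.sin(t / 8.0) + rng.gauss(0, 0.05) for t in range(weeks)]
    features = []
    for t in range(weeks):
        lag1 = ili[t - 1] if t > 0 else ili[0]   # historical flu activity
        search = ili[t] + rng.gauss(0, 0.2)      # Internet search proxy
        ehr = ili[t] + rng.gauss(0, 0.1)         # EHR proxy
        features.append([lag1, search, ehr])
    window = 26
    preds = rolling_forecasts(features, ili, window, penalty=0.01)
    truth = ili[window:]
    persistence = ili[window - 1:-1]             # baseline: last week's value
    model_err, base_err = rmse(preds, truth), rmse(persistence, truth)
    print("model RMSE:", round(model_err, 4))
    print("baseline RMSE:", round(base_err, 4))
    print("RMSE reduction (%):", round(rmse_reduction_pct(model_err, base_err), 1))


if __name__ == "__main__":
    demo()
```

Because the model is refit at every step, a coefficient that the penalty shrinks to zero in one week can re-enter in a later week, which is one way to read the abstract's "dynamically selects the most appropriate variables." The percentage helper mirrors how the reported reductions are expressed: a model RMSE of 0.67 against a baseline RMSE of 1.0 corresponds to a 33% reduction.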
format Online
Article
Text
id pubmed-5423019
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5423019 2017-05-10 Using electronic health records and Internet search information for accurate influenza forecasting Yang, Shihao Santillana, Mauricio Brownstein, John S. Gray, Josh Richardson, Stewart Kou, S. C. BMC Infect Dis Research Article BioMed Central 2017-05-08 /pmc/articles/PMC5423019/ /pubmed/28482810 http://dx.doi.org/10.1186/s12879-017-2424-7 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Using electronic health records and Internet search information for accurate influenza forecasting
topic Research Article