
An Automated Approach for Finding Spatio-Temporal Patterns of Seasonal Influenza in the United States: Algorithm Validation Study

BACKGROUND: Agencies such as the Centers for Disease Control and Prevention (CDC) currently release influenza-like illness incidence data, along with descriptive summaries of simple spatio-temporal patterns and trends. However, public health researchers, government agencies, as well as the general p...


Bibliographic Details
Main authors: Sambaturu, Prathyush, Bhattacharya, Parantapa, Chen, Jiangzhuo, Lewis, Bryan, Marathe, Madhav, Venkatramanan, Srinivasan, Vullikanti, Anil
Format: Online Article Text
Language: English
Published: JMIR Publications 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7501584/
https://www.ncbi.nlm.nih.gov/pubmed/32701458
http://dx.doi.org/10.2196/12842
author Sambaturu, Prathyush
Bhattacharya, Parantapa
Chen, Jiangzhuo
Lewis, Bryan
Marathe, Madhav
Venkatramanan, Srinivasan
Vullikanti, Anil
collection PubMed
description
BACKGROUND: Agencies such as the Centers for Disease Control and Prevention (CDC) currently release influenza-like illness incidence data, along with descriptive summaries of simple spatio-temporal patterns and trends. However, public health researchers, government agencies, as well as the general public, are often interested in deeper patterns and insights into how the disease is spreading, with additional context. Analysis by domain experts is needed for deriving such insights from incidence data.
OBJECTIVE: Our goal was to develop an automated approach for finding interesting spatio-temporal patterns in the spread of a disease over a large region, such as regions which have specific characteristics (eg, high incidence in a particular week, those which showed a sudden change in incidence) or regions which have significantly different incidence compared to earlier seasons.
METHODS: We developed techniques from the area of transactional data mining for characterizing and finding interesting spatio-temporal patterns in disease spread in an automated manner. A key part of our approach involved using the principle of minimum description length for representing a given target set in terms of combinations of attributes (referred to as clauses); we considered both positive and negative clauses, relaxed descriptions which approximately represent the set, and used integer programming to find such descriptions. Finally, we designed an automated approach, which examines a large space of sets corresponding to different spatio-temporal patterns, and ranks them based on the ratio of their size to their description length (referred to as the compression ratio).
RESULTS: We applied our methods using minimum description length to find spatio-temporal patterns in the spread of seasonal influenza in the United States using state level influenza-like illness activity indicator data from the CDC. We observed that the compression ratios were over 2.5 for 50% of the chosen sets, when approximate descriptions and negative clauses were allowed. Sets with high compression ratios (eg, over 2.5) corresponded to interesting patterns in the spatio-temporal dynamics of influenza-like illness. Our approach also outperformed description by solution in terms of the compression ratio.
CONCLUSIONS: Our approach, which is an unsupervised machine learning method, can provide new insights into patterns and trends in the disease spread in an automated manner. Our results show that the description complexity is an effective approach for characterizing sets of interest, which can be easily extended to other diseases and regions beyond influenza in the US. Our approach can also be easily adapted for automated generation of narratives.
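The idea in METHODS of describing a target set by clauses and scoring it by its compression ratio can be illustrated with a toy sketch. This is not the authors' implementation: it uses a brute-force search in place of integer programming, handles only exact descriptions built from positive clauses, and assumes that description length is the total number of attributes across the chosen clauses. The state attributes (`south`, `high_wk1`, and so on) and all function names are hypothetical.

```python
from itertools import combinations


def clause_cover(clause, items):
    """Return the names of items that carry every attribute in the clause."""
    return frozenset(name for name, attrs in items.items() if clause <= attrs)


def min_description(target, items, max_clause_len=2):
    """Brute-force the shortest exact description of `target`.

    A description is a set of positive clauses whose union of covered items
    equals the target. Assumption: its length is the total attribute count.
    Returns (length, clauses), or None if no exact description exists.
    """
    attributes = sorted(set().union(*items.values()))
    # Keep only clauses that match at least one item and nothing outside
    # the target, so a union of candidates can never overshoot it.
    candidates = []
    for k in range(1, max_clause_len + 1):
        for combo in combinations(attributes, k):
            clause = frozenset(combo)
            covered = clause_cover(clause, items)
            if covered and covered <= target:
                candidates.append((clause, covered))
    best = None
    # Exhaustive search over clause subsets; only viable for toy inputs.
    for r in range(1, len(candidates) + 1):
        for chosen in combinations(candidates, r):
            union = frozenset().union(*(cov for _, cov in chosen))
            if union == target:
                length = sum(len(cl) for cl, _ in chosen)
                if best is None or length < best[0]:
                    best = (length, [cl for cl, _ in chosen])
    return best


def compression_ratio(target, description_length):
    """Ratio of target set size to description length."""
    return len(target) / description_length


# Hypothetical toy data: states tagged with a region and an activity level.
items = {
    "VA": {"south", "high_wk1"},
    "NC": {"south", "high_wk1"},
    "GA": {"south", "high_wk1"},
    "NY": {"northeast", "low_wk1"},
    "MA": {"northeast", "high_wk1"},
}
target = frozenset({"VA", "NC", "GA"})

length, clauses = min_description(target, items)
ratio = compression_ratio(target, length)
print(length, clauses, ratio)
```

In this toy case the single clause `south` describes all three target states, so three elements are summarized by one attribute. Negative clauses and approximate descriptions, which the abstract reports as improving compression, are left out to keep the sketch short.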
format Online
Article
Text
id pubmed-7501584
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-7501584 2020-09-30 An Automated Approach for Finding Spatio-Temporal Patterns of Seasonal Influenza in the United States: Algorithm Validation Study. Sambaturu, Prathyush; Bhattacharya, Parantapa; Chen, Jiangzhuo; Lewis, Bryan; Marathe, Madhav; Venkatramanan, Srinivasan; Vullikanti, Anil. JMIR Public Health Surveill. Original Paper. JMIR Publications 2020-09-04 /pmc/articles/PMC7501584/ /pubmed/32701458 http://dx.doi.org/10.2196/12842 Text en
©Prathyush Sambaturu, Parantapa Bhattacharya, Jiangzhuo Chen, Bryan Lewis, Madhav Marathe, Srinivasan Venkatramanan, Anil Vullikanti. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 04.09.2020.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited. The complete bibliographic information, a link to the original publication on http://publichealth.jmir.org, as well as this copyright and license information must be included.
title An Automated Approach for Finding Spatio-Temporal Patterns of Seasonal Influenza in the United States: Algorithm Validation Study
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7501584/
https://www.ncbi.nlm.nih.gov/pubmed/32701458
http://dx.doi.org/10.2196/12842