An Automated Approach for Finding Spatio-Temporal Patterns of Seasonal Influenza in the United States: Algorithm Validation Study
BACKGROUND: Agencies such as the Centers for Disease Control and Prevention (CDC) currently release influenza-like illness incidence data, along with descriptive summaries of simple spatio-temporal patterns and trends. However, public health researchers, government agencies, as well as the general p...
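Main Authors: | Sambaturu, Prathyush; Bhattacharya, Parantapa; Chen, Jiangzhuo; Lewis, Bryan; Marathe, Madhav; Venkatramanan, Srinivasan; Vullikanti, Anil |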
---|---|
Format: | Online Article Text |
Language: | English |
Published: | JMIR Publications 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7501584/ https://www.ncbi.nlm.nih.gov/pubmed/32701458 http://dx.doi.org/10.2196/12842 |
_version_ | 1783584058550255616 |
---|---|
author | Sambaturu, Prathyush Bhattacharya, Parantapa Chen, Jiangzhuo Lewis, Bryan Marathe, Madhav Venkatramanan, Srinivasan Vullikanti, Anil |
author_sort | Sambaturu, Prathyush |
collection | PubMed |
description | BACKGROUND: Agencies such as the Centers for Disease Control and Prevention (CDC) currently release influenza-like illness incidence data, along with descriptive summaries of simple spatio-temporal patterns and trends. However, public health researchers, government agencies, as well as the general public, are often interested in deeper patterns and insights into how the disease is spreading, with additional context. Analysis by domain experts is needed for deriving such insights from incidence data. OBJECTIVE: Our goal was to develop an automated approach for finding interesting spatio-temporal patterns in the spread of a disease over a large region, such as regions which have specific characteristics (eg, high incidence in a particular week, those which showed a sudden change in incidence) or regions which have significantly different incidence compared to earlier seasons. METHODS: We developed techniques from the area of transactional data mining for characterizing and finding interesting spatio-temporal patterns in disease spread in an automated manner. A key part of our approach involved using the principle of minimum description length for representing a given target set in terms of combinations of attributes (referred to as clauses); we considered both positive and negative clauses, relaxed descriptions which approximately represent the set, and used integer programming to find such descriptions. Finally, we designed an automated approach, which examines a large space of sets corresponding to different spatio-temporal patterns, and ranks them based on the ratio of their size to their description length (referred to as the compression ratio). RESULTS: We applied our methods using minimum description length to find spatio-temporal patterns in the spread of seasonal influenza in the United States using state level influenza-like illness activity indicator data from the CDC. We observed that the compression ratios were over 2.5 for 50% of the chosen sets, when approximate descriptions and negative clauses were allowed. Sets with high compression ratios (eg, over 2.5) corresponded to interesting patterns in the spatio-temporal dynamics of influenza-like illness. Our approach also outperformed description by solution in terms of the compression ratio. CONCLUSIONS: Our approach, which is an unsupervised machine learning method, can provide new insights into patterns and trends in the disease spread in an automated manner. Our results show that the description complexity is an effective approach for characterizing sets of interest, which can be easily extended to other diseases and regions beyond influenza in the US. Our approach can also be easily adapted for automated generation of narratives. |
format | Online Article Text |
id | pubmed-7501584 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | JMIR Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-7501584 2020-09-30 JMIR Public Health Surveill Original Paper JMIR Publications 2020-09-04 /pmc/articles/PMC7501584/ /pubmed/32701458 http://dx.doi.org/10.2196/12842 Text en ©Prathyush Sambaturu, Parantapa Bhattacharya, Jiangzhuo Chen, Bryan Lewis, Madhav Marathe, Srinivasan Venkatramanan, Anil Vullikanti. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 04.09.2020. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited. The complete bibliographic information, a link to the original publication on http://publichealth.jmir.org, as well as this copyright and license information must be included. |
title | An Automated Approach for Finding Spatio-Temporal Patterns of Seasonal Influenza in the United States: Algorithm Validation Study |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7501584/ https://www.ncbi.nlm.nih.gov/pubmed/32701458 http://dx.doi.org/10.2196/12842 |
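
The METHODS summary in this record describes representing a target set by combinations of attributes (clauses) and ranking sets by the ratio of their size to their description length (the compression ratio). The following is a minimal sketch of that idea, not the authors' implementation. It assumes a toy setting: the state names, the attribute labels, and the function names are hypothetical; the description length is assumed to be the number of clauses; only positive clauses and exact descriptions are handled; and a brute-force search stands in for the integer programming the abstract mentions.

```python
from itertools import combinations


def clause_cover(clause, items):
    """Return the names of items that carry every attribute in the clause."""
    return frozenset(name for name, attrs in items.items() if clause <= attrs)


def shortest_description(target, items, max_clause_len=2):
    """Brute-force the fewest positive clauses whose union equals the target.

    Assumption: description length = number of clauses. The abstract's
    integer program is replaced here by exhaustive search, which is only
    practical for tiny examples.
    """
    target = frozenset(target)
    attributes = sorted(set().union(*items.values()))
    # Candidate clauses: attribute combinations that cover only target items.
    candidates = []
    for k in range(1, max_clause_len + 1):
        for combo in combinations(attributes, k):
            covered = clause_cover(frozenset(combo), items)
            if covered and covered <= target:
                candidates.append((combo, covered))
    # Try descriptions of increasing size; the first exact cover is shortest.
    for size in range(1, len(candidates) + 1):
        for chosen in combinations(candidates, size):
            if frozenset().union(*(cov for _, cov in chosen)) == target:
                return [combo for combo, _ in chosen]
    return None  # no exact description with the allowed clauses


def compression_ratio(target, description):
    """Size of the target set divided by the number of clauses describing it."""
    return len(target) / len(description)


# Hypothetical toy data: each state maps to a set of attribute labels.
items = {
    "VA": {"south", "high_wk1"},
    "NC": {"south", "high_wk1"},
    "GA": {"south", "high_wk1"},
    "TX": {"south", "low_wk1"},
    "NY": {"northeast", "high_wk1"},
    "MA": {"northeast", "low_wk1"},
}
target = {"VA", "NC", "GA"}

description = shortest_description(target, items)
print(description, compression_ratio(target, description))
```

In this toy example a single two-attribute clause describes all three target states, so the set compresses well. Ranking many candidate target sets by this ratio is the kind of automated screening the abstract outlines; approximate descriptions and negative clauses would need a richer formulation than this sketch.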