Modeling Baseline Shifts in Multivariate Disease Outbreak Detection
OBJECTIVE: Outbreak detection algorithms monitoring only disease-relevant data streams may be prone to false alarms due to baseline shifts. In this paper, we propose a Multinomial-Generalized-Dirichlet (MGD) model to adjust for baseline shifts. INTRODUCTION: Population surges or large events may cau...
Main Authors: | Que, Jialan; Tsui, Fu-Chiang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | University of Illinois at Chicago Library 2013 |
Subjects: | ISDS 2012 Conference Abstracts |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3692939/ |
_version_ | 1782274691469148160 |
---|---|
author | Que, Jialan Tsui, Fu-Chiang |
author_facet | Que, Jialan Tsui, Fu-Chiang |
author_sort | Que, Jialan |
collection | PubMed |
description | OBJECTIVE: Outbreak detection algorithms that monitor only disease-relevant data streams may be prone to false alarms due to baseline shifts. In this paper, we propose a Multinomial-Generalized-Dirichlet (MGD) model to adjust for baseline shifts. INTRODUCTION: Population surges or large events may cause shifts in the data collected by biosurveillance systems [1]. For example, the Cherry Blossom Festival brings hundreds of thousands of people to DC every year, which results in simultaneous elevations in multiple data streams (Fig. 1). In this paper, we propose an MGD model to handle such baseline shifts. METHODS: Existing multivariate algorithms model only disease-relevant data streams (e.g., anti-fever medication sales or patient visits with constitutional syndrome for detection of a flu outbreak). In contrast, we also incorporate a non-disease-relevant data stream as a control factor. We assume that the counts from all data streams follow a Multinomial distribution. Under this distribution, the expected value of the distribution parameter does not change during a baseline shift; however, it has to change in order to model an outbreak. Therefore, this distribution inherently adjusts for baseline shifts. In addition, we use the generalized Dirichlet (GD) distribution to model the parameter, since the GD distribution is one of the conjugate priors of the Multinomial [2]. We call this model the Multinomial-Generalized-Dirichlet (MGD) model. RESULTS: We applied the MGD model in our previously proposed Rank-Based Spatial Clustering (MRSC) algorithm [3]. We simulated both outbreak cases and baseline shift phenomena. The experiment includes two groups of data sets. The first includes data sets injected only with outbreak cases, and the second includes data sets with both outbreak cases and baseline shifts. We applied the MRSC algorithm and a reference method, the Multivariate Bayesian Scan Statistic (MBSS) algorithm (which analyzes only the disease-relevant data streams) [4], to both groups of data sets. Fig. 2 shows the performance of outbreak detection: the ROC curves and AMOC curves from analyzing the data sets with baseline shifts (solid lines) and without (dashed lines). Fig. 2 shows that the performance of MBSS dropped much more than that of MRSC when analyzing the data sets with baseline shifts. CONCLUSIONS: The MGD model can be a useful supplementary model for detecting disease outbreaks, achieving both better sensitivity and better specificity, especially when baseline shifts are present in the data. |
format | Online Article Text |
id | pubmed-3692939 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | University of Illinois at Chicago Library |
record_format | MEDLINE/PubMed |
spelling | pubmed-3692939 2013-06-26 Modeling Baseline Shifts in Multivariate Disease Outbreak Detection Que, Jialan Tsui, Fu-Chiang Online J Public Health Inform ISDS 2012 Conference Abstracts University of Illinois at Chicago Library 2013-04-04 /pmc/articles/PMC3692939/ Text en ©2013 the author(s) http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/ojphi/about/submissions#copyrightNotice This is an Open Access article. Authors own copyright of their articles appearing in the Online Journal of Public Health Informatics. Readers may copy articles without permission of the copyright owner(s), as long as the author and OJPHI are acknowledged in the copy and the copy is used for educational, not-for-profit purposes. |
spellingShingle | ISDS 2012 Conference Abstracts Que, Jialan Tsui, Fu-Chiang Modeling Baseline Shifts in Multivariate Disease Outbreak Detection |
title | Modeling Baseline Shifts in Multivariate Disease Outbreak Detection |
title_full | Modeling Baseline Shifts in Multivariate Disease Outbreak Detection |
title_fullStr | Modeling Baseline Shifts in Multivariate Disease Outbreak Detection |
title_full_unstemmed | Modeling Baseline Shifts in Multivariate Disease Outbreak Detection |
title_short | Modeling Baseline Shifts in Multivariate Disease Outbreak Detection |
title_sort | modeling baseline shifts in multivariate disease outbreak detection |
topic | ISDS 2012 Conference Abstracts |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3692939/ |
work_keys_str_mv | AT quejialan modelingbaselineshiftsinmultivariatediseaseoutbreakdetection AT tsuifuchiang modelingbaselineshiftsinmultivariatediseaseoutbreakdetection |
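
The METHODS paragraph of the abstract argues that, once a non-disease-relevant control stream is included, the Multinomial proportions stay the same under a baseline shift but must change under an outbreak. The following is a minimal Python sketch of that argument, not the authors' implementation. The stream names, the counts, the prior pseudo-counts, and the deviation score are all hypothetical, and a plain Dirichlet prior is used as a simplified stand-in for the generalized Dirichlet prior named in the abstract.

```python
def proportions(counts):
    """Share of the total count per stream (the multinomial maximum-likelihood estimate)."""
    total = sum(counts)
    return [c / total for c in counts]


def dirichlet_posterior_mean(alpha, counts):
    """Posterior mean of the multinomial proportions under a Dirichlet(alpha) prior.

    Assumption: a plain Dirichlet stands in for the generalized Dirichlet;
    both are conjugate to the multinomial, but this is the simpler update.
    """
    posterior = [a + c for a, c in zip(alpha, counts)]
    total = sum(posterior)
    return [p / total for p in posterior]


def max_deviation(reference, observed):
    """Largest absolute change in any stream's proportion (illustrative score only)."""
    return max(abs(r - o) for r, o in zip(reference, observed))


# Hypothetical daily counts:
# [anti-fever medication sales, constitutional-syndrome visits, control stream]
baseline = [120, 80, 800]

# Baseline shift: a population surge scales every stream by the same factor (x1.5).
shifted = [c * 3 // 2 for c in baseline]

# Outbreak: only the disease-relevant streams rise; the control stream does not.
outbreak = [baseline[0] + 60, baseline[1] + 40, baseline[2]]

# Hypothetical prior pseudo-counts whose mean matches the baseline proportions.
alpha = [12, 8, 80]
prior_mean = proportions(alpha)

shift_score = max_deviation(prior_mean, dirichlet_posterior_mean(alpha, shifted))
outbreak_score = max_deviation(prior_mean, dirichlet_posterior_mean(alpha, outbreak))

print("baseline proportions:", proportions(baseline))
print("shifted proportions: ", proportions(shifted))
print("outbreak proportions:", proportions(outbreak))
print("deviation under shift:   ", shift_score)
print("deviation under outbreak:", outbreak_score)
```

In this toy setting the disease-relevant counts are identical in the shifted and outbreak scenarios (180 and 120), so a method that looks only at those two streams cannot tell them apart. The control stream is what separates them: the proportions are unchanged under the shift and move only under the outbreak.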