
Determinants of Outbreak Detection Performance



Bibliographic Details
Main Authors: Jafarpour, Nastaran, Precup, Doina, Buckeridge, David
Format: Online Article Text
Language: English
Published: University of Illinois at Chicago Library 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3692940/
_version_ 1782274691690397696
author Jafarpour, Nastaran
Precup, Doina
Buckeridge, David
author_facet Jafarpour, Nastaran
Precup, Doina
Buckeridge, David
author_sort Jafarpour, Nastaran
collection PubMed
description OBJECTIVE: To predict the performance of outbreak detection algorithms under different circumstances, in order to guide method selection and algorithm configuration in surveillance systems; to characterize how the performance of detection algorithms depends on the type and severity of the outbreak; and to develop quantitative evidence about the determinants of detection performance.

INTRODUCTION: The choice of outbreak detection algorithm and its configuration can result in important variations in the performance of public health surveillance systems. Our work aims to characterize the performance of detectors based on outbreak types. We use Bayesian networks (BN) to model the relationships between the determinants of outbreak detection and detection performance, based on a large study of simulated data.

METHODS: The simulated surveillance data were generated by the Surveillance Lab at McGill University using the Simulation Analysis Platform [1], modeling surveillance in an urban area to detect waterborne outbreaks caused by the failure of a water treatment plant. We focus on predicting the performance of the C-family of algorithms because they are widely used, state-of-the-art outbreak detection algorithms [2]. We investigate the influence of algorithm characteristics and outbreak characteristics on outbreak detection performance. C1, C2, and C3 are distinguished by the configuration of two parameters, the guardband and the memory. Gradually increasing outbreaks can bias the test statistic upward, so that the detection algorithm fails to flag the outbreak. To avoid this, C2 and C3 use a 2-day gap, the guardband, between the baseline interval and the test interval. C3 additionally includes the 2 most recent observations, the memory, in the computation of the test statistic. The W2 algorithm is a modified version of C2 that takes the weekly patterns of surveillance time series into account [3]. In W2, the baseline data are stratified into two distinct baselines: one for weekdays and one for weekends. W3 likewise includes the 2 most recent observations of each baseline when calculating the test statistic on the corresponding baseline (see the code sketch after the RESULTS paragraph). We ran C1, C2, C3, W2, and W3 on 18,000 simulated time series and measured the sensitivity and specificity of detection. We then created a training data set of 5,400,000 instances, each instance being the result of evaluating one outbreak detection algorithm with a specific parameter setting. To investigate the determinants of detection performance and quantify their effects, we used BNs to predict performance from algorithm characteristics and outbreak characteristics.

RESULTS: We developed two BN models in the Weka machine learning software [4], evaluated with 5-fold cross-validation. The first BN models the effect of algorithm characteristics (guardband, memory, alerting threshold, and a weekly-pattern indicator: 0 for C-algorithms, 1 for W-algorithms) and outbreak characteristics (contamination level and duration) on the sensitivity of detection. Sensitivity values were mapped to 4 classes: (0, 0.3], (0.3, 0.6], (0.6, 0.9], (0.9, 1]. This BN correctly classified 67.74% of instances; the misclassification error was 0.9407. The second BN, which predicts the specificity of detection over 10 classes, correctly classified 95.895% of instances; the misclassification error was 0.2975.
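The guardband, memory, and weekday/weekend stratification described in METHODS can be made concrete with a short sketch. The Python below is a minimal illustration under stated assumptions: the 7-day baseline window, the positive-part handling of the memory term, the illustrative threshold of 3.0, and all function and parameter names are ours, not the exact EARS C1/C2/C3 or W2/W3 definitions used in the study.

```python
import numpy as np

def c_statistic(counts, t, baseline_len=7, guardband=0, memory=0):
    """Sketch of a C-family (EARS-style) test statistic for the count on day t.

    guardband: gap in days between the baseline window and the test day
               (0 for C1, 2 for C2 and C3, as described above).
    memory:    number of recent standardized residuals folded into the
               statistic (0 for C1 and C2, 2 for C3).
    """
    assert t - guardband - baseline_len - memory >= 0, "not enough history"

    def residual(day):
        end = day - guardband                   # baseline ends guardband days before `day`
        base = counts[end - baseline_len:end]   # 7-day baseline length is an assumption
        mu, sigma = base.mean(), max(base.std(ddof=1), 1e-6)
        return (counts[day] - mu) / sigma

    stat = residual(t)
    # C3-style memory: carry forward the positive parts of recent residuals.
    for k in range(1, memory + 1):
        stat += max(residual(t - k), 0.0)
    return stat

def w_statistic(counts, is_weekend, t, **kwargs):
    """W2/W3-style variant: compute the statistic on the stratum (weekday or
    weekend series) that the test day belongs to, per the description above."""
    stratum = np.flatnonzero(is_weekend == is_weekend[t])
    stratum = stratum[stratum <= t]
    return c_statistic(counts[stratum], len(stratum) - 1, **kwargs)

# Example: flag day 30 of a Poisson-like series when the statistic exceeds a threshold.
rng = np.random.default_rng(0)
series = rng.poisson(10, size=31).astype(float)
alert = c_statistic(series, t=30, guardband=2, memory=2) > 3.0  # threshold is illustrative
```

An alert is raised whenever the statistic exceeds the alerting threshold; varying that threshold is what trades sensitivity against specificity in the evaluation described above.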
CONCLUSIONS: The contamination level and duration of the outbreak, the alerting threshold, the memory, the guardband, and whether the weekly pattern is taken into account all influence the sensitivity and specificity of outbreak detection; given the C-algorithm parameter settings, we can predict outbreak detection performance quantitatively. In future work, we plan to investigate other predictors of performance and to study how these predictions can be used in algorithm and policy choices.
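As a complement to the RESULTS paragraph, the sketch below shows one plausible way to discretize a sensitivity value into the four classes reported above and to combine it with the algorithm and outbreak characteristics into a single training instance. The field names, example values, and dictionary layout are illustrative assumptions, not the authors' actual schema, and the sketch does not touch the Weka BN interface itself.

```python
def sensitivity_class(sensitivity):
    """Map a sensitivity value in (0, 1] to the 4 classes used above."""
    for upper, label in [(0.3, "(0,0.3]"), (0.6, "(0.3,0.6]"),
                         (0.9, "(0.6,0.9]"), (1.0, "(0.9,1]")]:
        if sensitivity <= upper:
            return label
    raise ValueError("sensitivity must lie in (0, 1]")

def make_instance(guardband, memory, threshold, weekly_pattern,
                  contamination_level, duration, sensitivity):
    """One hypothetical training row: algorithm characteristics, outbreak
    characteristics, and the discretized outcome that the first BN predicts."""
    return {
        "guardband": guardband,                  # 0 (C1) or 2 (C2/C3, W2/W3) days
        "memory": memory,                        # 0 or 2 recent observations
        "alerting_threshold": threshold,
        "weekly_pattern": weekly_pattern,        # 0 for C-algorithms, 1 for W-algorithms
        "contamination_level": contamination_level,
        "outbreak_duration": duration,
        "sensitivity_class": sensitivity_class(sensitivity),
    }

# Example row for a C3-like configuration that detected 72% of simulated outbreaks.
row = make_instance(guardband=2, memory=2, threshold=3.0, weekly_pattern=0,
                    contamination_level="medium", duration=10, sensitivity=0.72)
```

One such row per evaluation run would yield a table of the kind described above, ready to be loaded into a BN learner such as Weka's.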
format Online
Article
Text
id pubmed-3692940
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher University of Illinois at Chicago Library
record_format MEDLINE/PubMed
spelling pubmed-3692940 2013-06-26 Determinants of Outbreak Detection Performance Jafarpour, Nastaran Precup, Doina Buckeridge, David Online J Public Health Inform ISDS 2012 Conference Abstracts University of Illinois at Chicago Library 2013-04-04 /pmc/articles/PMC3692940/ Text en ©2013 the author(s) http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/ojphi/about/submissions#copyrightNotice This is an Open Access article. Authors own copyright of their articles appearing in the Online Journal of Public Health Informatics. Readers may copy articles without permission of the copyright owner(s), as long as the author and OJPHI are acknowledged in the copy and the copy is used for educational, not-for-profit purposes.
spellingShingle ISDS 2012 Conference Abstracts
Jafarpour, Nastaran
Precup, Doina
Buckeridge, David
Determinants of Outbreak Detection Performance
title Determinants of Outbreak Detection Performance
title_full Determinants of Outbreak Detection Performance
title_fullStr Determinants of Outbreak Detection Performance
title_full_unstemmed Determinants of Outbreak Detection Performance
title_short Determinants of Outbreak Detection Performance
title_sort determinants of outbreak detection performance
topic ISDS 2012 Conference Abstracts
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3692940/
work_keys_str_mv AT jafarpournastaran determinantsofoutbreakdetectionperformance
AT precupdoina determinantsofoutbreakdetectionperformance
AT buckeridgedavid determinantsofoutbreakdetectionperformance