Determinants of Outbreak Detection Performance

Bibliographic Details
Main authors: Jafarpour, Nastaran; Precup, Doina; Buckeridge, David
Format: Online Article Text
Language: English
Published: University of Illinois at Chicago Library, 2013
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3692940/
Description
Summary:

OBJECTIVE: To predict the performance of outbreak detection algorithms under different circumstances, in order to guide method selection and algorithm configuration in surveillance systems; to characterize the dependence of detection performance on the type and severity of the outbreak; and to develop quantitative evidence about the determinants of detection performance.

INTRODUCTION: The choice of outbreak detection algorithm and its configuration can result in important variations in the performance of public health surveillance systems. Our work aims to characterize the performance of detectors based on outbreak types. We use Bayesian networks (BN) to model the relationships between the determinants of outbreak detection and the detection performance, based on an extensive study of simulated data.

METHODS: The simulated surveillance data that we used were generated by the Surveillance Lab of McGill University using the Simulation Analysis Platform [1], simulating surveillance in an urban area to detect waterborne outbreaks due to the failure of a water treatment plant. We focus on predicting the performance of the C-family of algorithms because they are widely used, state-of-the-art outbreak detection algorithms [2]. We investigate the influence of algorithm characteristics and outbreak characteristics on outbreak detection performance. The C1, C2, and C3 algorithms are distinguished by the configuration of 2 parameters: the guardband and the memory. Generally, a gradually increasing outbreak can contaminate the baseline and bias its estimates upward, so the test statistic is reduced and the detection algorithm may fail to flag the outbreak. To avoid this situation, the C2 and C3 use a 2-day gap, called the guardband, between the baseline interval and the test interval. The C3 also includes 2 recent observations, called the memory, in the computation of the test statistic. The W2 algorithm is a modified version of the C2 that takes the weekly patterns of surveillance time series into account [3].
In the W2, the baseline data are stratified into 2 distinct baselines: one for weekdays and one for weekends. The W3 includes 2 recent observations from each baseline when calculating the test statistic for the corresponding baseline. We ran the C1, C2, C3, W2, and W3 on 18,000 simulated time series and measured the sensitivity and specificity of detection. We then created a training data set of 5,400,000 instances. Each instance was the result of a performance evaluation of an outbreak detection algorithm with a specific setting of parameters. To investigate the determinants of detection performance and reveal their effects quantitatively, we used BN to predict the performance from algorithm characteristics and outbreak characteristics.

RESULTS: We developed 2 BN models in the Weka machine learning software [4] using 5-fold cross-validation. The first BN determines the effect of algorithm characteristics (the guardband, memory, alerting threshold, and a weekly pattern indicator: 0 for C-algorithms, 1 for W-algorithms) and outbreak characteristics (contamination level and duration) on the sensitivity of detection. The value of sensitivity was mapped to 4 classes: (0, 0.3], (0.3, 0.6], (0.6, 0.9], (0.9, 1]. The developed BN correctly classified 67.74% of instances. The misclassification error was 0.9407. The second BN, for predicting the specificity of detection, correctly classified 95.895% of instances in 10 classes, and the misclassification error was 0.2975.

CONCLUSIONS: The contamination level and duration of outbreaks, the alerting threshold, the memory, the guardband, and whether the weekly pattern was considered all influence the sensitivity and specificity of outbreak detection. Given the C-algorithm parameter settings, we can predict outbreak detection performance quantitatively. In future work, we plan to investigate other predictors of performance and study how these predictions can be used in algorithm and policy choices.
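
The roles of the guardband and memory parameters described in METHODS can be illustrated with a minimal Python sketch. It is not the authors' implementation: the 7-day baseline, the z-score form of the test statistic, the "excess over 1" rule used for memory, the fallback for a flat baseline, and the names `c_statistic` and `c_statistic_with_memory` are all assumptions, loosely following how the C-family is commonly described. The daily counts are made up for illustration.

```python
from statistics import mean, stdev


def c_statistic(counts, t, baseline=7, guardband=0):
    """Z-score of counts[t] against a baseline that ends `guardband` days before day t."""
    end = t - guardband            # baseline covers counts[start:end], end excluded
    start = end - baseline
    if start < 0:
        raise ValueError("not enough history for the requested baseline")
    window = counts[start:end]
    sd = stdev(window)
    if sd == 0:
        sd = 1.0                   # assumption: avoid division by zero on a flat baseline
    return (counts[t] - mean(window)) / sd


def c_statistic_with_memory(counts, t, baseline=7, guardband=2, memory=2):
    """Add up the excess of the z-score over 1 for day t and the `memory` days before it."""
    return sum(
        max(0.0, c_statistic(counts, day, baseline, guardband) - 1.0)
        for day in range(t - memory, t + 1)
    )


# Hypothetical daily counts: a flat period followed by a gradually rising outbreak.
counts = [10, 12, 11, 9, 10, 11, 10, 12, 9, 11, 13, 16, 20, 25]
today = len(counts) - 1

c1_like = c_statistic(counts, today, guardband=0)   # baseline absorbs the rising days
c2_like = c_statistic(counts, today, guardband=2)   # baseline skips the last 2 days
c3_like = c_statistic_with_memory(counts, today)    # guardband plus 2 days of memory

print(f"C1-like: {c1_like:.2f}  C2-like: {c2_like:.2f}  C3-like: {c3_like:.2f}")
```

On this made-up ramp, the statistic computed with a guardband is larger than the one computed without it, because the most recent, already elevated days are kept out of the baseline. An alert would be raised when the statistic exceeds the alerting threshold, which is one of the predictors in the BN models.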