Evaluation of the application of sequence data to the identification of outbreaks of disease using anomaly detection methods
Anomaly detection methods have a great potential to assist the detection of diseases in animal production systems. We used sequence data of Porcine Reproductive and Respiratory Syndrome (PRRS) to define the emergence of new strains at the farm level. We evaluated the performance of 24 anomaly detect...
Main authors: | Díaz-Cao, José Manuel; Liu, Xin; Kim, Jeonghoon; Clavijo, Maria Jose; Martínez-López, Beatriz |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10492347/ https://www.ncbi.nlm.nih.gov/pubmed/37684632 http://dx.doi.org/10.1186/s13567-023-01197-3 |
_version_ | 1785104234976378880 |
---|---|
author | Díaz-Cao, José Manuel Liu, Xin Kim, Jeonghoon Clavijo, Maria Jose Martínez-López, Beatriz |
author_facet | Díaz-Cao, José Manuel Liu, Xin Kim, Jeonghoon Clavijo, Maria Jose Martínez-López, Beatriz |
author_sort | Díaz-Cao, José Manuel |
collection | PubMed |
description | Anomaly detection methods have great potential to assist the detection of diseases in animal production systems. We used sequence data of Porcine Reproductive and Respiratory Syndrome (PRRS) to define the emergence of new strains at the farm level. We evaluated the performance of 24 anomaly detection methods based on machine learning, regression, time series techniques and control charts to identify outbreaks in time series of new strains, and compared the best methods using different time series: PCR positives, PCR requests and laboratory requests. We introduced synthetic outbreaks of different sizes and calculated the probability of detection of outbreaks (POD), sensitivity (Se), probability of detection of outbreaks in the first week of appearance (POD1w) and background alarm rate (BAR). The time series of new strains from sequence data outperformed the other types of data, but POD, Se and POD1w were high only when outbreaks were large. The methods based on Long Short-Term Memory (LSTM) and Bayesian approaches presented the best performance. Using anomaly detection methods with sequence data may help to identify the emergence of cases in multiple farms, but more work is required to improve detection in time series of high variability. Our results suggest a promising application of sequence data for early detection of diseases at a production system level. This may provide a simple way to extract additional value from routine laboratory analysis. Next steps should include validation of this approach in different settings and with different diseases. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13567-023-01197-3. |
format | Online Article Text |
id | pubmed-10492347 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-10492347 2023-09-10 Vet Res Research Article BioMed Central 2023-09-08 2023 /pmc/articles/PMC10492347/ /pubmed/37684632 http://dx.doi.org/10.1186/s13567-023-01197-3 Text en © The Author(s) 2023. Open Access: this article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Evaluation of the application of sequence data to the identification of outbreaks of disease using anomaly detection methods |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10492347/ https://www.ncbi.nlm.nih.gov/pubmed/37684632 http://dx.doi.org/10.1186/s13567-023-01197-3 |