
Anomaly Detection in Endemic Disease Surveillance Data Using Machine Learning Techniques

Disease surveillance is used to monitor ongoing control activities, detect early outbreaks, and inform intervention priorities and policies. However, data from disease surveillance that could be used to support real-time decision-making remain largely underutilised. Using the Brazilian Amazon malaria...


Bibliographic Details
Main authors: Eze, Peter U., Geard, Nicholas, Mueller, Ivo, Chades, Iadine
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10341307/
https://www.ncbi.nlm.nih.gov/pubmed/37444730
http://dx.doi.org/10.3390/healthcare11131896
_version_ 1785072231304396800
author Eze, Peter U.
Geard, Nicholas
Mueller, Ivo
Chades, Iadine
author_facet Eze, Peter U.
Geard, Nicholas
Mueller, Ivo
Chades, Iadine
author_sort Eze, Peter U.
collection PubMed
description Disease surveillance is used to monitor ongoing control activities, detect early outbreaks, and inform intervention priorities and policies. However, data from disease surveillance that could be used to support real-time decision-making remain largely underutilised. Using the Brazilian Amazon malaria surveillance dataset as a case study, in this paper we explore the potential for unsupervised anomaly detection machine learning techniques to discover signals of epidemiological interest. We found that our models were able to provide an early indication of outbreak onset, outbreak peaks, and change points in the proportion of positive malaria cases. Specifically, the sustained rise in malaria in the Brazilian Amazon in 2016 was flagged by several models. We found that no single model detected all anomalies across all health regions. Because of this, we provide the minimum number of machine learning models (top-k models) to maximise the number of anomalies detected across different health regions. We discovered that the top three models that maximise the coverage of the number and types of anomalies detected across the thirteen health regions are principal component analysis, stochastic outlier selection, and the minimum covariance determinant. Anomaly detection is a potentially valuable approach to discovering patterns of epidemiological importance when confronted with a large volume of data across space and time. Our exploratory approach can be replicated for other diseases and locations to inform monitoring, timely interventions, and actions towards the goal of controlling endemic disease.
format Online
Article
Text
id pubmed-10341307
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10341307 2023-07-14 Anomaly Detection in Endemic Disease Surveillance Data Using Machine Learning Techniques Eze, Peter U. Geard, Nicholas Mueller, Ivo Chades, Iadine Healthcare (Basel) Article Disease surveillance is used to monitor ongoing control activities, detect early outbreaks, and inform intervention priorities and policies. However, data from disease surveillance that could be used to support real-time decision-making remain largely underutilised. Using the Brazilian Amazon malaria surveillance dataset as a case study, in this paper we explore the potential for unsupervised anomaly detection machine learning techniques to discover signals of epidemiological interest. We found that our models were able to provide an early indication of outbreak onset, outbreak peaks, and change points in the proportion of positive malaria cases. Specifically, the sustained rise in malaria in the Brazilian Amazon in 2016 was flagged by several models. We found that no single model detected all anomalies across all health regions. Because of this, we provide the minimum number of machine learning models (top-k models) to maximise the number of anomalies detected across different health regions. We discovered that the top three models that maximise the coverage of the number and types of anomalies detected across the thirteen health regions are principal component analysis, stochastic outlier selection, and the minimum covariance determinant. Anomaly detection is a potentially valuable approach to discovering patterns of epidemiological importance when confronted with a large volume of data across space and time. Our exploratory approach can be replicated for other diseases and locations to inform monitoring, timely interventions, and actions towards the goal of controlling endemic disease. MDPI 2023-06-30 /pmc/articles/PMC10341307/ /pubmed/37444730 http://dx.doi.org/10.3390/healthcare11131896 Text en © 2023 by the authors.
https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Eze, Peter U.
Geard, Nicholas
Mueller, Ivo
Chades, Iadine
Anomaly Detection in Endemic Disease Surveillance Data Using Machine Learning Techniques
title Anomaly Detection in Endemic Disease Surveillance Data Using Machine Learning Techniques
title_full Anomaly Detection in Endemic Disease Surveillance Data Using Machine Learning Techniques
title_fullStr Anomaly Detection in Endemic Disease Surveillance Data Using Machine Learning Techniques
title_full_unstemmed Anomaly Detection in Endemic Disease Surveillance Data Using Machine Learning Techniques
title_short Anomaly Detection in Endemic Disease Surveillance Data Using Machine Learning Techniques
title_sort anomaly detection in endemic disease surveillance data using machine learning techniques
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10341307/
https://www.ncbi.nlm.nih.gov/pubmed/37444730
http://dx.doi.org/10.3390/healthcare11131896
work_keys_str_mv AT ezepeteru anomalydetectioninendemicdiseasesurveillancedatausingmachinelearningtechniques
AT geardnicholas anomalydetectioninendemicdiseasesurveillancedatausingmachinelearningtechniques
AT muellerivo anomalydetectioninendemicdiseasesurveillancedatausingmachinelearningtechniques
AT chadesiadine anomalydetectioninendemicdiseasesurveillancedatausingmachinelearningtechniques