Assessment of vector-host-pathogen relationships using data mining and machine learning

Bibliographic Details
Main Authors: Agany, Diing D.M., Pietri, Jose E., Gnimpieba, Etienne Z.
Format: Online Article Text
Language: English
Published: Research Network of Computational and Structural Biotechnology 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7340972/
https://www.ncbi.nlm.nih.gov/pubmed/32670510
http://dx.doi.org/10.1016/j.csbj.2020.06.031
author Agany, Diing D.M.
Pietri, Jose E.
Gnimpieba, Etienne Z.
collection PubMed
description Infectious diseases, including vector-borne diseases transmitted by arthropods, are a leading cause of morbidity and mortality worldwide. In the era of big data, addressing broad-scale, fundamental questions regarding the complex dynamics of these diseases will increasingly require the integration of diverse datasets to produce new biological knowledge. This review provides a current snapshot of the systematic assessment of the relationships between microbial pathogens, arthropod vectors and mammalian hosts using data mining and machine learning. We employ PRISMA to identify 32 key papers relevant to this topic. Our analysis shows an increasing use of data mining and machine learning tasks and techniques, including prediction, classification, clustering, association rules mining, and deep learning, over the last decade. However, it also reveals a number of critical challenges in applying these to the study of vector-host-pathogen interactions at various systems biology levels. Here, relevant studies, current limitations and future directions are discussed. Furthermore, the quality of data in relevant papers was assessed using the FAIR (Findable, Accessible, Interoperable, Reusable) compliance criteria to evaluate and encourage reproducibility and shareability of research outcomes. Although shortcomings in their application remain, data mining and machine learning have significant potential to break new ground in understanding fundamental aspects of vector-host-pathogen relationships and their application in this field should be encouraged. In particular, while predictive modeling, feature engineering and supervised machine learning are already being used in the field, other data mining and machine learning methods such as deep learning and association rules analysis lag behind and should be implemented in combination with established methods to accelerate hypothesis and knowledge generation in the domain.
format Online
Article
Text
id pubmed-7340972
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Research Network of Computational and Structural Biotechnology
record_format MEDLINE/PubMed
spelling pubmed-7340972 2020-07-14 Assessment of vector-host-pathogen relationships using data mining and machine learning Agany, Diing D.M. Pietri, Jose E. Gnimpieba, Etienne Z. Comput Struct Biotechnol J Review Article Research Network of Computational and Structural Biotechnology 2020-06-25 /pmc/articles/PMC7340972/ /pubmed/32670510 http://dx.doi.org/10.1016/j.csbj.2020.06.031 Text en http://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Assessment of vector-host-pathogen relationships using data mining and machine learning
topic Review Article