Assessment of vector-host-pathogen relationships using data mining and machine learning
Main Authors: | Agany, Diing D.M., Pietri, Jose E., Gnimpieba, Etienne Z. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Research Network of Computational and Structural Biotechnology 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7340972/ https://www.ncbi.nlm.nih.gov/pubmed/32670510 http://dx.doi.org/10.1016/j.csbj.2020.06.031 |
_version_ | 1783555135489703936 |
---|---|
author | Agany, Diing D.M. Pietri, Jose E. Gnimpieba, Etienne Z. |
author_facet | Agany, Diing D.M. Pietri, Jose E. Gnimpieba, Etienne Z. |
author_sort | Agany, Diing D.M. |
collection | PubMed |
description | Infectious diseases, including vector-borne diseases transmitted by arthropods, are a leading cause of morbidity and mortality worldwide. In the era of big data, addressing broad-scale, fundamental questions regarding the complex dynamics of these diseases will increasingly require the integration of diverse datasets to produce new biological knowledge. This review provides a current snapshot of the systematic assessment of the relationships between microbial pathogens, arthropod vectors and mammalian hosts using data mining and machine learning. We employ PRISMA to identify 32 key papers relevant to this topic. Our analysis shows an increasing use of data mining and machine learning tasks and techniques, including prediction, classification, clustering, association rules mining, and deep learning, over the last decade. However, it also reveals a number of critical challenges in applying these to the study of vector-host-pathogen interactions at various systems biology levels. Here, relevant studies, current limitations and future directions are discussed. Furthermore, the quality of data in relevant papers was assessed using the FAIR (Findable, Accessible, Interoperable, Reusable) compliance criteria to evaluate and encourage reproducibility and shareability of research outcomes. Although shortcomings in their application remain, data mining and machine learning have significant potential to break new ground in understanding fundamental aspects of vector-host-pathogen relationships and their application in this field should be encouraged. In particular, while predictive modeling, feature engineering and supervised machine learning are already being used in the field, other data mining and machine learning methods such as deep learning and association rules analysis lag behind and should be implemented in combination with established methods to accelerate hypothesis and knowledge generation in the domain. |
format | Online Article Text |
id | pubmed-7340972 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Research Network of Computational and Structural Biotechnology |
record_format | MEDLINE/PubMed |
spelling | pubmed-7340972 2020-07-14 Assessment of vector-host-pathogen relationships using data mining and machine learning Agany, Diing D.M. Pietri, Jose E. Gnimpieba, Etienne Z. Comput Struct Biotechnol J Review Article Research Network of Computational and Structural Biotechnology 2020-06-25 /pmc/articles/PMC7340972/ /pubmed/32670510 http://dx.doi.org/10.1016/j.csbj.2020.06.031 Text en http://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Review Article Agany, Diing D.M. Pietri, Jose E. Gnimpieba, Etienne Z. Assessment of vector-host-pathogen relationships using data mining and machine learning |
title | Assessment of vector-host-pathogen relationships using data mining and machine learning |
title_sort | assessment of vector-host-pathogen relationships using data mining and machine learning |
topic | Review Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7340972/ https://www.ncbi.nlm.nih.gov/pubmed/32670510 http://dx.doi.org/10.1016/j.csbj.2020.06.031 |