SAELGMDA: Identifying human microbe–disease associations based on sparse autoencoder and LightGBM
INTRODUCTION: Identification of complex associations between diseases and microbes is important to understand the pathogenesis of diseases and design therapeutic strategies. Biomedical experiment-based Microbe-Disease Association (MDA) detection methods are expensive, time-consuming, and laborious....
Main authors: | Wang, Feixiang; Yang, Huandong; Wu, Yan; Peng, Lihong; Li, Xiaoling |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2023 |
Subjects: | Microbiology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10320730/ https://www.ncbi.nlm.nih.gov/pubmed/37415823 http://dx.doi.org/10.3389/fmicb.2023.1207209 |
_version_ | 1785068497457381376 |
---|---|
author | Wang, Feixiang Yang, Huandong Wu, Yan Peng, Lihong Li, Xiaoling |
author_facet | Wang, Feixiang Yang, Huandong Wu, Yan Peng, Lihong Li, Xiaoling |
author_sort | Wang, Feixiang |
collection | PubMed |
description | INTRODUCTION: Identification of complex associations between diseases and microbes is important to understand the pathogenesis of diseases and design therapeutic strategies. Biomedical experiment-based Microbe-Disease Association (MDA) detection methods are expensive, time-consuming, and laborious. METHODS: Here, we developed a computational method called SAELGMDA for potential MDA prediction. First, microbe similarity and disease similarity are computed by integrating their functional similarity and Gaussian interaction profile kernel similarity. Second, each microbe-disease pair is represented as a feature vector by combining the microbe and disease similarity matrices. Next, the obtained feature vectors are mapped to a low-dimensional space based on a Sparse AutoEncoder. Finally, unknown microbe-disease pairs are classified based on a Light Gradient Boosting Machine (LightGBM). RESULTS: The proposed SAELGMDA method was compared with four state-of-the-art MDA methods (MNNMDA, GATMDA, NTSHMDA, and LRLSHMDA) under five-fold cross validations on diseases, microbes, and microbe-disease pairs on the HMDAD and Disbiome databases. The results show that SAELGMDA achieved the best accuracy, Matthews correlation coefficient, AUC, and AUPR under the majority of conditions, outperforming the other four MDA prediction models. In particular, SAELGMDA obtained the best AUCs of 0.8358 and 0.9301 under cross validation on diseases, 0.9838 and 0.9293 under cross validation on microbes, and 0.9857 and 0.9358 under cross validation on microbe-disease pairs on the HMDAD and Disbiome databases. Colorectal cancer, inflammatory bowel disease, and lung cancer are diseases that severely threaten human health. We used the proposed SAELGMDA method to find possible microbes for the three diseases. The results demonstrate that there are potential associations between Clostridium coccoides and colorectal cancer and one between Sphingomonadaceae and inflammatory bowel disease. In addition, Veillonella may associate with autism. The inferred MDAs need further validation. CONCLUSION: We anticipate that the proposed SAELGMDA method contributes to the identification of new MDAs. |
format | Online Article Text |
id | pubmed-10320730 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-10320730 2023-07-06 SAELGMDA: Identifying human microbe–disease associations based on sparse autoencoder and LightGBM Wang, Feixiang Yang, Huandong Wu, Yan Peng, Lihong Li, Xiaoling Front Microbiol Microbiology INTRODUCTION: Identification of complex associations between diseases and microbes is important to understand the pathogenesis of diseases and design therapeutic strategies. Biomedical experiment-based Microbe-Disease Association (MDA) detection methods are expensive, time-consuming, and laborious. METHODS: Here, we developed a computational method called SAELGMDA for potential MDA prediction. First, microbe similarity and disease similarity are computed by integrating their functional similarity and Gaussian interaction profile kernel similarity. Second, each microbe-disease pair is represented as a feature vector by combining the microbe and disease similarity matrices. Next, the obtained feature vectors are mapped to a low-dimensional space based on a Sparse AutoEncoder. Finally, unknown microbe-disease pairs are classified based on a Light Gradient Boosting Machine (LightGBM). RESULTS: The proposed SAELGMDA method was compared with four state-of-the-art MDA methods (MNNMDA, GATMDA, NTSHMDA, and LRLSHMDA) under five-fold cross validations on diseases, microbes, and microbe-disease pairs on the HMDAD and Disbiome databases. The results show that SAELGMDA achieved the best accuracy, Matthews correlation coefficient, AUC, and AUPR under the majority of conditions, outperforming the other four MDA prediction models. In particular, SAELGMDA obtained the best AUCs of 0.8358 and 0.9301 under cross validation on diseases, 0.9838 and 0.9293 under cross validation on microbes, and 0.9857 and 0.9358 under cross validation on microbe-disease pairs on the HMDAD and Disbiome databases. Colorectal cancer, inflammatory bowel disease, and lung cancer are diseases that severely threaten human health. We used the proposed SAELGMDA method to find possible microbes for the three diseases. The results demonstrate that there are potential associations between Clostridium coccoides and colorectal cancer and one between Sphingomonadaceae and inflammatory bowel disease. In addition, Veillonella may associate with autism. The inferred MDAs need further validation. CONCLUSION: We anticipate that the proposed SAELGMDA method contributes to the identification of new MDAs. Frontiers Media S.A. 2023-06-21 /pmc/articles/PMC10320730/ /pubmed/37415823 http://dx.doi.org/10.3389/fmicb.2023.1207209 Text en Copyright © 2023 Wang, Yang, Wu, Peng and Li. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Microbiology Wang, Feixiang Yang, Huandong Wu, Yan Peng, Lihong Li, Xiaoling SAELGMDA: Identifying human microbe–disease associations based on sparse autoencoder and LightGBM |
title | SAELGMDA: Identifying human microbe–disease associations based on sparse autoencoder and LightGBM |
title_full | SAELGMDA: Identifying human microbe–disease associations based on sparse autoencoder and LightGBM |
title_fullStr | SAELGMDA: Identifying human microbe–disease associations based on sparse autoencoder and LightGBM |
title_full_unstemmed | SAELGMDA: Identifying human microbe–disease associations based on sparse autoencoder and LightGBM |
title_short | SAELGMDA: Identifying human microbe–disease associations based on sparse autoencoder and LightGBM |
title_sort | saelgmda: identifying human microbe–disease associations based on sparse autoencoder and lightgbm |
topic | Microbiology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10320730/ https://www.ncbi.nlm.nih.gov/pubmed/37415823 http://dx.doi.org/10.3389/fmicb.2023.1207209 |
work_keys_str_mv | AT wangfeixiang saelgmdaidentifyinghumanmicrobediseaseassociationsbasedonsparseautoencoderandlightgbm AT yanghuandong saelgmdaidentifyinghumanmicrobediseaseassociationsbasedonsparseautoencoderandlightgbm AT wuyan saelgmdaidentifyinghumanmicrobediseaseassociationsbasedonsparseautoencoderandlightgbm AT penglihong saelgmdaidentifyinghumanmicrobediseaseassociationsbasedonsparseautoencoderandlightgbm AT lixiaoling saelgmdaidentifyinghumanmicrobediseaseassociationsbasedonsparseautoencoderandlightgbm |
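
The first two steps named in the description (Gaussian interaction profile kernel similarity, then a feature vector per microbe-disease pair) can be illustrated with a minimal sketch. This is not the authors' code. It uses the commonly cited form of the Gaussian interaction profile kernel, omits the functional-similarity integration, and assumes that "combining" the similarity matrices means concatenating the microbe's similarity row with the disease's similarity row. The function names, the `gamma_prime` parameter, and the toy association matrix are all hypothetical.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel similarity between rows of `profiles`.

    Each row is the binary association profile of one entity (hypothetical input).
    """
    sq_norms = [sum(v * v for v in row) for row in profiles]
    mean_sq_norm = sum(sq_norms) / len(profiles)
    # Bandwidth is normalised by the mean squared norm of the profiles.
    gamma = gamma_prime / mean_sq_norm if mean_sq_norm > 0 else gamma_prime
    n = len(profiles)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            sim[i][j] = math.exp(-gamma * sq_dist)
    return sim


def transpose(matrix):
    """Swap rows and columns so that disease profiles become rows."""
    return [list(col) for col in zip(*matrix)]


def pair_features(microbe_sim, disease_sim, microbe_idx, disease_idx):
    """Assumed combination rule: concatenate the two similarity rows."""
    return microbe_sim[microbe_idx] + disease_sim[disease_idx]


# Toy association matrix: 4 microbes (rows) x 3 diseases (columns); made-up data.
assoc = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 1],
]
microbe_sim = gip_kernel(assoc)
disease_sim = gip_kernel(transpose(assoc))
features = pair_features(microbe_sim, disease_sim, microbe_idx=0, disease_idx=2)
print(len(features))
```

In the pipeline the description outlines, vectors like `features` would then be compressed by a sparse autoencoder and classified with LightGBM; those two stages are left out here because the record gives no architecture or hyperparameter details.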