GBDR: a Bayesian model for precise prediction of pathogenic microorganisms using 16S rRNA gene sequences

BACKGROUND: Recent evidence suggests that human microorganisms participate in important biological activities in the human body. Dysfunction of host-microbiota interactions could lead to complex human disorders. Knowledge of host-microbiota interactions can provide valuable insights i...

Bibliographic Details
Main Authors: Huang, Yu-An, Huang, Zhi-An, Li, Jian-Qiang, You, Zhu-Hong, Wang, Lei, Yi, Hai-Cheng, Yu, Chang-Qing
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8925046/
https://www.ncbi.nlm.nih.gov/pubmed/35296232
http://dx.doi.org/10.1186/s12864-022-08423-w
author Huang, Yu-An
Huang, Zhi-An
Li, Jian-Qiang
You, Zhu-Hong
Wang, Lei
Yi, Hai-Cheng
Yu, Chang-Qing
author_sort Huang, Yu-An
collection PubMed
description BACKGROUND: Recent evidence suggests that human microorganisms participate in important biological activities in the human body. Dysfunction of host-microbiota interactions could lead to complex human disorders. Knowledge of host-microbiota interactions can provide valuable insights into the pathological mechanisms of diseases. However, it is time-consuming and costly to identify disorder-specific microbes from the biological “haystack” by routine wet-lab experiments alone. With developments in next-generation sequencing and omics-based trials, it is imperative to develop computational models for predicting microbe-disease associations on a large scale. RESULTS: Based on the known microbe-disease associations derived from the Human Microbe-Disease Association Database (HMDAD), the proposed model shows reliable performance, with areas under the ROC curve (AUC) of 0.9456 and 0.8866 in leave-one-out cross-validation and five-fold cross-validation, respectively. In case studies of colorectal carcinoma, 80% of the top 20 predicted microbes have been experimentally confirmed in the published literature. CONCLUSION: Based on the assumption that functionally similar microbes tend to share similar interaction patterns with human diseases, we propose a group-based computational model of Bayesian disease-oriented ranking to prioritize the microbes most likely to be associated with various human diseases. Based on gene sequence information, two computational approaches (BLAST+ and MEGA 7) are leveraged to measure microbe-microbe similarity from different perspectives. Disease-disease similarity is calculated by capturing the hierarchy information from the Medical Subject Headings (MeSH) data. The experimental results illustrate the accuracy and effectiveness of the proposed model. This work is expected to facilitate the characterization and identification of promising microbial biomarkers. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-022-08423-w.
format Online
Article
Text
id pubmed-8925046
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8925046 2022-03-23 GBDR: a Bayesian model for precise prediction of pathogenic microorganisms using 16S rRNA gene sequences Huang, Yu-An Huang, Zhi-An Li, Jian-Qiang You, Zhu-Hong Wang, Lei Yi, Hai-Cheng Yu, Chang-Qing BMC Genomics Research BioMed Central 2022-03-16 /pmc/articles/PMC8925046/ /pubmed/35296232 http://dx.doi.org/10.1186/s12864-022-08423-w Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title GBDR: a Bayesian model for precise prediction of pathogenic microorganisms using 16S rRNA gene sequences
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8925046/
https://www.ncbi.nlm.nih.gov/pubmed/35296232
http://dx.doi.org/10.1186/s12864-022-08423-w