
Inflammatory bowel disease biomarkers of human gut microbiota selected via different feature selection methods

The tremendous boost in next-generation sequencing and in “omics” technologies makes it possible to characterize the human gut microbiome, the collective genomes of the microbial community that resides in our gastrointestinal tract. Although some of these microorganisms are considered to be essent...


Bibliographic Details
Main authors: Bakir-Gungor, Burcu, Hacılar, Hilal, Jabeer, Amhar, Nalbantoglu, Ozkan Ufuk, Aran, Oya, Yousef, Malik
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2022
Subjects: Bioinformatics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9048649/
https://www.ncbi.nlm.nih.gov/pubmed/35497193
http://dx.doi.org/10.7717/peerj.13205
author Bakir-Gungor, Burcu
Hacılar, Hilal
Jabeer, Amhar
Nalbantoglu, Ozkan Ufuk
Aran, Oya
Yousef, Malik
collection PubMed
description The tremendous boost in next-generation sequencing and in “omics” technologies makes it possible to characterize the human gut microbiome, the collective genomes of the microbial community that resides in our gastrointestinal tract. Although some of these microorganisms are considered to be essential regulators of our immune system, alteration of the complexity and eubiotic state of the microbiota might promote autoimmune and inflammatory disorders such as diabetes, rheumatoid arthritis, inflammatory bowel diseases (IBD), obesity, and carcinogenesis. IBD, comprising Crohn’s disease and ulcerative colitis, is a gut-related, multifactorial disease with an unknown etiology. IBD presents defects in the detection and control of the gut microbiota, associated with unbalanced immune reactions, genetic mutations that confer susceptibility to the disease, and complex environmental conditions such as a westernized lifestyle. Although some existing studies attempt to unveil the composition and functional capacity of the gut microbiome in relation to IBD, a comprehensive picture of the gut microbiome in IBD patients is far from complete. Due to the complexity of metagenomic studies, state-of-the-art machine learning techniques have become popular for addressing a wide range of questions in metagenomic data analysis. In this regard, using an IBD-associated metagenomics dataset, this study utilizes both supervised and unsupervised machine learning algorithms (i) to generate a classification model that aids IBD diagnosis, (ii) to discover IBD-associated biomarkers, and (iii) to discover subgroups of IBD patients using k-means and hierarchical clustering approaches.
To deal with the high dimensionality of features, we applied robust feature selection algorithms such as Conditional Mutual Information Maximization (CMIM), Fast Correlation Based Filter (FCBF), minimum redundancy maximum relevance (mRMR), Select K Best (SKB), Information Gain (IG), and Extreme Gradient Boosting (XGBoost). In our experiments with 100-fold Monte Carlo cross-validation (MCCV), the XGBoost, IG, and SKB methods were notably effective at minimizing the number of microbial features used for the diagnosis of IBD, thus reducing cost and time. We observed that, compared to Decision Tree, Support Vector Machine, LogitBoost, AdaBoost, and stacking ensemble classifiers, our Random Forest classifier achieved better performance measures for the classification of IBD. Our findings revealed potential microbiome-mediated mechanisms of IBD, and these findings might be useful for the development of microbiome-based diagnostics.
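The evaluation scheme named in the abstract (feature selection followed by a Random Forest, scored over repeated random train/test splits) can be illustrated with a minimal sketch. This is not the authors' pipeline: the synthetic data, the `f_classif` scoring function for Select K Best, the 10% test fraction, the number of trees, and the helper name `mccv_auc` are all assumptions made for illustration, and scikit-learn's `StratifiedShuffleSplit` is used here as one common way to realize Monte Carlo cross-validation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.pipeline import make_pipeline


def mccv_auc(X, y, k_features=20, n_splits=100, test_size=0.1, seed=0):
    """Per-split AUC of SelectKBest + Random Forest under Monte Carlo CV.

    Hypothetical helper: every default here is an illustrative assumption.
    """
    # Monte Carlo CV: draw n_splits independent random train/test partitions.
    splitter = StratifiedShuffleSplit(
        n_splits=n_splits, test_size=test_size, random_state=seed
    )
    scores = []
    for train_idx, test_idx in splitter.split(X, y):
        # Feature selection sits inside the pipeline so it is fitted on the
        # training portion only, which avoids leaking test information.
        model = make_pipeline(
            SelectKBest(score_func=f_classif, k=k_features),
            RandomForestClassifier(n_estimators=100, random_state=seed),
        )
        model.fit(X[train_idx], y[train_idx])
        proba = model.predict_proba(X[test_idx])[:, 1]
        scores.append(roc_auc_score(y[test_idx], proba))
    return np.array(scores)


# Synthetic stand-in for a samples-by-taxa abundance table (assumption).
X, y = make_classification(
    n_samples=200, n_features=100, n_informative=10, random_state=0
)
scores = mccv_auc(X, y, k_features=20, n_splits=10)
print(f"mean AUC over {len(scores)} splits: {scores.mean():.3f}")
```

Setting `n_splits=100` would match the 100-fold MCCV mentioned in the abstract; the call above uses 10 splits only to keep the sketch quick to run.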
format Online
Article
Text
id pubmed-9048649
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-9048649 2022-04-29 Inflammatory bowel disease biomarkers of human gut microbiota selected via different feature selection methods Bakir-Gungor, Burcu Hacılar, Hilal Jabeer, Amhar Nalbantoglu, Ozkan Ufuk Aran, Oya Yousef, Malik PeerJ Bioinformatics PeerJ Inc. 2022-04-25 /pmc/articles/PMC9048649/ /pubmed/35497193 http://dx.doi.org/10.7717/peerj.13205 Text en ©2022 Bakir-Gungor et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
title Inflammatory bowel disease biomarkers of human gut microbiota selected via different feature selection methods
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9048649/
https://www.ncbi.nlm.nih.gov/pubmed/35497193
http://dx.doi.org/10.7717/peerj.13205