LightCUD: a program for diagnosing IBD based on human gut microbiome data

BACKGROUND: The diagnosis of inflammatory bowel disease (IBD) and discrimination between the types of IBD are clinically important. IBD is associated with marked changes in the intestinal microbiota. Advances in next-generation sequencing (NGS) technology and the improved hospital bioinformatics ana...

Bibliographic Details
Main authors: Xu, Congmin; Zhou, Man; Xie, Zhongjie; Li, Mo; Zhu, Xi; Zhu, Huaiqiu
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7816363/
https://www.ncbi.nlm.nih.gov/pubmed/33468221
http://dx.doi.org/10.1186/s13040-021-00241-2
_version_ 1783638427103657984
author Xu, Congmin
Zhou, Man
Xie, Zhongjie
Li, Mo
Zhu, Xi
Zhu, Huaiqiu
author_facet Xu, Congmin
Zhou, Man
Xie, Zhongjie
Li, Mo
Zhu, Xi
Zhu, Huaiqiu
author_sort Xu, Congmin
collection PubMed
description BACKGROUND: The diagnosis of inflammatory bowel disease (IBD) and discrimination between the types of IBD are clinically important. IBD is associated with marked changes in the intestinal microbiota. Advances in next-generation sequencing (NGS) technology and the improved bioinformatics analysis capabilities of hospitals motivated us to develop a diagnostic method based on the gut microbiome. RESULTS: Using a set of whole-genome sequencing (WGS) data from 349 human gut microbiota samples with two types of IBD and healthy controls, we assembled and aligned WGS short reads to obtain feature profiles of strains and genera. The genus and strain profiles were used to construct the 16S-based and WGS-based diagnostic modules, respectively. We designed a novel feature selection procedure to select case-specific features. With these features, we built discrimination models using different machine learning algorithms. LightGBM outperformed the other algorithms in this study and was therefore chosen as the core algorithm. Specifically, we identified two small sets of biomarkers (strains), one for the WGS-based health vs IBD module and one for the ulcerative colitis vs Crohn’s disease module, which contributed to the optimization of model performance during pre-training. We released LightCUD as an IBD diagnostic program built with LightGBM. Its high performance was validated through five-fold cross-validation and on an independent test data set. LightCUD was implemented in Python and packaged for free installation with customized databases. With WGS data or 16S rRNA sequencing data of gut microbiome samples as the input, LightCUD can discriminate IBD from healthy controls with high accuracy and further identify the specific type of IBD. The executable program LightCUD was released as open source with instructions at the webpage http://cqb.pku.edu.cn/ZhuLab/LightCUD/. The identified strain biomarkers could be used to study the critical factors for disease development and to recommend treatments regarding changes in the gut microbial community. CONCLUSIONS: As the first released human gut microbiome-based IBD diagnostic tool, LightCUD demonstrates high performance for both WGS and 16S sequencing data. The strains that either distinguish healthy controls from IBD patients or distinguish the specific type of IBD are expected to be clinically important as biomarkers. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13040-021-00241-2.
format Online
Article
Text
id pubmed-7816363
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7816363 2021-01-21 LightCUD: a program for diagnosing IBD based on human gut microbiome data Xu, Congmin Zhou, Man Xie, Zhongjie Li, Mo Zhu, Xi Zhu, Huaiqiu BioData Min Research BACKGROUND: The diagnosis of inflammatory bowel disease (IBD) and discrimination between the types of IBD are clinically important. IBD is associated with marked changes in the intestinal microbiota. Advances in next-generation sequencing (NGS) technology and the improved bioinformatics analysis capabilities of hospitals motivated us to develop a diagnostic method based on the gut microbiome. RESULTS: Using a set of whole-genome sequencing (WGS) data from 349 human gut microbiota samples with two types of IBD and healthy controls, we assembled and aligned WGS short reads to obtain feature profiles of strains and genera. The genus and strain profiles were used to construct the 16S-based and WGS-based diagnostic modules, respectively. We designed a novel feature selection procedure to select case-specific features. With these features, we built discrimination models using different machine learning algorithms. LightGBM outperformed the other algorithms in this study and was therefore chosen as the core algorithm. Specifically, we identified two small sets of biomarkers (strains), one for the WGS-based health vs IBD module and one for the ulcerative colitis vs Crohn’s disease module, which contributed to the optimization of model performance during pre-training. We released LightCUD as an IBD diagnostic program built with LightGBM. Its high performance was validated through five-fold cross-validation and on an independent test data set. LightCUD was implemented in Python and packaged for free installation with customized databases. With WGS data or 16S rRNA sequencing data of gut microbiome samples as the input, LightCUD can discriminate IBD from healthy controls with high accuracy and further identify the specific type of IBD. The executable program LightCUD was released as open source with instructions at the webpage http://cqb.pku.edu.cn/ZhuLab/LightCUD/. The identified strain biomarkers could be used to study the critical factors for disease development and to recommend treatments regarding changes in the gut microbial community. CONCLUSIONS: As the first released human gut microbiome-based IBD diagnostic tool, LightCUD demonstrates high performance for both WGS and 16S sequencing data. The strains that either distinguish healthy controls from IBD patients or distinguish the specific type of IBD are expected to be clinically important as biomarkers. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13040-021-00241-2. BioMed Central 2021-01-19 /pmc/articles/PMC7816363/ /pubmed/33468221 http://dx.doi.org/10.1186/s13040-021-00241-2 Text en © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Xu, Congmin
Zhou, Man
Xie, Zhongjie
Li, Mo
Zhu, Xi
Zhu, Huaiqiu
LightCUD: a program for diagnosing IBD based on human gut microbiome data
title LightCUD: a program for diagnosing IBD based on human gut microbiome data
title_full LightCUD: a program for diagnosing IBD based on human gut microbiome data
title_fullStr LightCUD: a program for diagnosing IBD based on human gut microbiome data
title_full_unstemmed LightCUD: a program for diagnosing IBD based on human gut microbiome data
title_short LightCUD: a program for diagnosing IBD based on human gut microbiome data
title_sort lightcud: a program for diagnosing ibd based on human gut microbiome data
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7816363/
https://www.ncbi.nlm.nih.gov/pubmed/33468221
http://dx.doi.org/10.1186/s13040-021-00241-2
work_keys_str_mv AT xucongmin lightcudaprogramfordiagnosingibdbasedonhumangutmicrobiomedata
AT zhouman lightcudaprogramfordiagnosingibdbasedonhumangutmicrobiomedata
AT xiezhongjie lightcudaprogramfordiagnosingibdbasedonhumangutmicrobiomedata
AT limo lightcudaprogramfordiagnosingibdbasedonhumangutmicrobiomedata
AT zhuxi lightcudaprogramfordiagnosingibdbasedonhumangutmicrobiomedata
AT zhuhuaiqiu lightcudaprogramfordiagnosingibdbasedonhumangutmicrobiomedata