
Machine learning framework for gut microbiome biomarkers discovery and modulation analysis in large-scale obese population

BACKGROUND: The gut microbiome has proven to be an important factor affecting obesity; however, it remains a challenge to identify consistent biomarkers across geographic locations and perform precisely targeted modulation for obese individuals. RESULTS: This study proposed a systematic machine lear...


Bibliographic Details
Main authors: Liu, Yaoliang, Zhu, Jinlin, Wang, Hongchao, Lu, Wenwei, LEE, Yuan Kun, Zhao, Jianxin, Zhang, Hao
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9789565/
https://www.ncbi.nlm.nih.gov/pubmed/36564713
http://dx.doi.org/10.1186/s12864-022-09087-2
author Liu, Yaoliang
Zhu, Jinlin
Wang, Hongchao
Lu, Wenwei
LEE, Yuan Kun
Zhao, Jianxin
Zhang, Hao
collection PubMed
description BACKGROUND: The gut microbiome has proven to be an important factor affecting obesity; however, it remains a challenge to identify consistent biomarkers across geographic locations and perform precisely targeted modulation for obese individuals. RESULTS: This study proposed a systematic machine learning framework and applied it to 870 human stool metagenomes across five countries to obtain comprehensive regional shared biomarkers and conduct a personalized modulation analysis. In our pipeline, a heterogeneous ensemble feature selection diagram was first developed to determine an optimal subset of biomarkers through the aggregation of multiple techniques. Subsequently, a deep reinforcement learning method was established to alter the targeted composition to the desired healthy target. In this manner, we can realize personalized modulation by counterfactual inference. Consequently, a total of 42 species were identified as regional shared biomarkers, and they showed good performance in distinguishing obese people from the healthy group (area under curve (AUC) = 0.85) when demonstrated on validation datasets. In addition, by pooling all counterfactual explanations, we found that Akkermansia muciniphila, Faecalibacterium prausnitzii, Prevotella copri, Bacteroides dorei, Bacteroides eggerthii, Alistipes finegoldii, Alistipes shahii, Eubacterium sp. CAG:180, and Roseburia hominis may be potential broad-spectrum targets with consistent modulation in the multi-regional obese population. CONCLUSIONS: This article shows that, based on our proposed machine-learning framework, we can obtain more comprehensive and accurate biomarkers and provide modulation analysis for the obese population. Moreover, our machine-learning framework will also be very useful for other researchers to further obtain biomarkers and perform counterfactual modulation analysis in different diseases.
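The description mentions a heterogeneous ensemble feature selection step that aggregates multiple techniques, but it does not name them. The following is only a minimal sketch of that general idea, not the authors' pipeline: three illustrative scorers (standardized mean difference, single-feature AUC, and a one-threshold decision stump) each rank the features, and the ranks are averaged. All function names, the choice of scorers, and the mean-rank aggregation rule are assumptions made for illustration.

```python
from statistics import mean, pstdev


def split_by_label(col, y):
    """Split one feature column into values of class 1 and class 0."""
    pos = [v for v, label in zip(col, y) if label == 1]
    neg = [v for v, label in zip(col, y) if label == 0]
    return pos, neg


def mean_diff_score(col, y):
    """Absolute difference of class means, scaled by the column's spread."""
    pos, neg = split_by_label(col, y)
    return abs(mean(pos) - mean(neg)) / (pstdev(col) or 1.0)


def auc_score(col, y):
    """Distance of the single-feature AUC from chance level (0.5)."""
    pos, neg = split_by_label(col, y)
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return abs(wins / (len(pos) * len(neg)) - 0.5)


def stump_score(col, y):
    """Best accuracy of a one-threshold rule, in either direction."""
    best = 0.5
    for t in set(col):
        correct = sum((v > t) == (label == 1) for v, label in zip(col, y))
        acc = correct / len(y)
        best = max(best, acc, 1 - acc)
    return best


def rank_desc(scores):
    """Rank positions per feature (0 = highest score)."""
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    ranks = [0] * len(scores)
    for position, j in enumerate(order):
        ranks[j] = position
    return ranks


def ensemble_select(X, y, k):
    """Return the k feature indices with the lowest mean rank.

    X is a list of samples (rows), y a list of 0/1 labels.
    The three scorers are hypothetical stand-ins for "multiple techniques".
    """
    cols = list(zip(*X))
    scorers = (mean_diff_score, auc_score, stump_score)
    all_ranks = [rank_desc([s(c, y) for c in cols]) for s in scorers]
    mean_rank = [mean([r[j] for r in all_ranks]) for j in range(len(cols))]
    return sorted(range(len(cols)), key=lambda j: mean_rank[j])[:k]


if __name__ == "__main__":
    # Toy data: only the middle feature depends on the label.
    y = [0] * 10 + [1] * 10
    X = [[(i * 7) % 5, y[i] * 10 + i % 3, (i * 3) % 4] for i in range(20)]
    print(ensemble_select(X, y, 1))
```

Averaging ranks rather than raw scores keeps scorers with different scales comparable; other aggregation rules (for example, voting on top-k membership) would fit the same structure.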
format Online
Article
Text
id pubmed-9789565
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9789565 2022-12-25 Machine learning framework for gut microbiome biomarkers discovery and modulation analysis in large-scale obese population Liu, Yaoliang Zhu, Jinlin Wang, Hongchao Lu, Wenwei LEE, Yuan Kun Zhao, Jianxin Zhang, Hao BMC Genomics Research BioMed Central 2022-12-23 /pmc/articles/PMC9789565/ /pubmed/36564713 http://dx.doi.org/10.1186/s12864-022-09087-2 Text en © The Author(s) 2022. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Machine learning framework for gut microbiome biomarkers discovery and modulation analysis in large-scale obese population
topic Research