Machine learning framework for gut microbiome biomarkers discovery and modulation analysis in large-scale obese population
BACKGROUND: The gut microbiome has proven to be an important factor affecting obesity; however, it remains a challenge to identify consistent biomarkers across geographic locations and perform precisely targeted modulation for obese individuals. RESULTS: This study proposed a systematic machine lear...
Main authors: | Liu, Yaoliang; Zhu, Jinlin; Wang, Hongchao; Lu, Wenwei; LEE, Yuan Kun; Zhao, Jianxin; Zhang, Hao |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9789565/ https://www.ncbi.nlm.nih.gov/pubmed/36564713 http://dx.doi.org/10.1186/s12864-022-09087-2 |
_version_ | 1784858983679393792 |
---|---|
author | Liu, Yaoliang Zhu, Jinlin Wang, Hongchao Lu, Wenwei LEE, Yuan Kun Zhao, Jianxin Zhang, Hao |
collection | PubMed |
description | BACKGROUND: The gut microbiome has proven to be an important factor affecting obesity; however, it remains a challenge to identify consistent biomarkers across geographic locations and to perform precisely targeted modulation for obese individuals. RESULTS: This study proposed a systematic machine learning framework and applied it to 870 human stool metagenomes across five countries to obtain comprehensive regional shared biomarkers and conduct a personalized modulation analysis. In our pipeline, a heterogeneous ensemble feature selection diagram was first developed to determine an optimal subset of biomarkers through the aggregation of multiple techniques. Subsequently, a deep reinforcement learning method was established to alter the targeted composition to the desired healthy target. In this manner, personalized modulation can be realized by counterfactual inference. Consequently, a total of 42 species were identified as regional shared biomarkers, and they showed good performance in distinguishing obese people from the healthy group (area under the curve (AUC) = 0.85) when demonstrated on validation datasets. In addition, by pooling all counterfactual explanations, we found that Akkermansia muciniphila, Faecalibacterium prausnitzii, Prevotella copri, Bacteroides dorei, Bacteroides eggerthii, Alistipes finegoldii, Alistipes shahii, Eubacterium sp. _CAG_180, and Roseburia hominis may be potential broad-spectrum targets with consistent modulation in the multi-regional obese population. CONCLUSIONS: This article shows that, based on our proposed machine-learning framework, we can obtain more comprehensive and accurate biomarkers and provide modulation analysis for the obese population. Moreover, our machine-learning framework should also be useful for other researchers seeking to obtain biomarkers and perform counterfactual modulation analysis in different diseases. |
format | Online Article Text |
id | pubmed-9789565 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-9789565 2022-12-25 Machine learning framework for gut microbiome biomarkers discovery and modulation analysis in large-scale obese population Liu, Yaoliang Zhu, Jinlin Wang, Hongchao Lu, Wenwei LEE, Yuan Kun Zhao, Jianxin Zhang, Hao BMC Genomics Research BioMed Central 2022-12-23 /pmc/articles/PMC9789565/ /pubmed/36564713 http://dx.doi.org/10.1186/s12864-022-09087-2 Text en © The Author(s) 2022 Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Machine learning framework for gut microbiome biomarkers discovery and modulation analysis in large-scale obese population |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9789565/ https://www.ncbi.nlm.nih.gov/pubmed/36564713 http://dx.doi.org/10.1186/s12864-022-09087-2 |