Enhanced metagenomic deep learning for disease prediction and consistent signature recognition by restructured microbiome 2D representations

Metagenomic analysis has been explored for disease diagnosis and biomarker discovery. Low sample sizes, high dimensionality, and sparsity of metagenomic data challenge metagenomic investigations. Here, an unsupervised microbial embedding, grouping, and mapping algorithm (MEGMA) was developed to tran...

Bibliographic Details
Main authors: Shen, Wan Xiang, Liang, Shu Ran, Jiang, Yu Yang, Chen, Yu Zong
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9868677/
https://www.ncbi.nlm.nih.gov/pubmed/36699735
http://dx.doi.org/10.1016/j.patter.2022.100658
author Shen, Wan Xiang
Liang, Shu Ran
Jiang, Yu Yang
Chen, Yu Zong
collection PubMed
description Metagenomic analysis has been explored for disease diagnosis and biomarker discovery. Low sample sizes, high dimensionality, and sparsity of metagenomic data challenge metagenomic investigations. Here, an unsupervised microbial embedding, grouping, and mapping algorithm (MEGMA) was developed to transform metagenomic data into individualized multichannel microbiome 2D representations by manifold learning and clustering of microbial profiles (e.g., composition, abundance, hierarchy, and taxonomy). These 2D representations enable enhanced disease prediction by established ConvNet-based AggMapNet models, outperforming the commonly used machine learning and deep learning models on metagenomic benchmark datasets. Combined with the AggMapNet explainable module, these 2D representations robustly identified more reliable and replicable disease-prediction microbes (biomarkers). Of the biomarkers that the MEGMA-AggMapNet pipeline identified from 5 disease datasets, 84% have been described in over 74 distinct works as important for these diseases. Moreover, the method also discovered highly consistent sets of biomarkers in cross-cohort colorectal cancer (CRC) patients and microbial shifts in different CRC stages.
format Online
Article
Text
id pubmed-9868677
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-9868677 2023-01-24 Enhanced metagenomic deep learning for disease prediction and consistent signature recognition by restructured microbiome 2D representations Shen, Wan Xiang Liang, Shu Ran Jiang, Yu Yang Chen, Yu Zong Patterns (N Y) Article Metagenomic analysis has been explored for disease diagnosis and biomarker discovery. Low sample sizes, high dimensionality, and sparsity of metagenomic data challenge metagenomic investigations. Here, an unsupervised microbial embedding, grouping, and mapping algorithm (MEGMA) was developed to transform metagenomic data into individualized multichannel microbiome 2D representations by manifold learning and clustering of microbial profiles (e.g., composition, abundance, hierarchy, and taxonomy). These 2D representations enable enhanced disease prediction by established ConvNet-based AggMapNet models, outperforming the commonly used machine learning and deep learning models on metagenomic benchmark datasets. Combined with the AggMapNet explainable module, these 2D representations robustly identified more reliable and replicable disease-prediction microbes (biomarkers). Of the biomarkers that the MEGMA-AggMapNet pipeline identified from 5 disease datasets, 84% have been described in over 74 distinct works as important for these diseases. Moreover, the method also discovered highly consistent sets of biomarkers in cross-cohort colorectal cancer (CRC) patients and microbial shifts in different CRC stages. Elsevier 2022-12-15 /pmc/articles/PMC9868677/ /pubmed/36699735 http://dx.doi.org/10.1016/j.patter.2022.100658 Text en © 2022 The Author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Enhanced metagenomic deep learning for disease prediction and consistent signature recognition by restructured microbiome 2D representations
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9868677/
https://www.ncbi.nlm.nih.gov/pubmed/36699735
http://dx.doi.org/10.1016/j.patter.2022.100658