A robust and transformation-free joint model with matching and regularization for metagenomic trajectory and disease onset

Bibliographic Details
Main authors: Li, Qian; Vehik, Kendra; Li, Cai; Triplett, Eric; Roesch, Luiz; Hu, Yi-Juan; Krischer, Jeffrey
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9484160/
https://www.ncbi.nlm.nih.gov/pubmed/36123651
http://dx.doi.org/10.1186/s12864-022-08890-1
_version_ 1784791825916100608
author Li, Qian
Vehik, Kendra
Li, Cai
Triplett, Eric
Roesch, Luiz
Hu, Yi-Juan
Krischer, Jeffrey
author_sort Li, Qian
collection PubMed
description BACKGROUND: To identify operational taxonomy units (OTUs) signaling disease onset in an observational study, a powerful strategy was selecting participants by matched sets and profiling temporal metagenomes, followed by trajectory analysis. Existing trajectory analyses modeled individual OTU or microbial community without adjusting for the within-community correlation and matched-set-specific latent factors. RESULTS: We proposed a joint model with matching and regularization (JMR) to detect OTU-specific trajectory predictive of host disease status. The between- and within-matched-sets heterogeneity in OTU relative abundance and disease risk were modeled by nested random effects. The inherent negative correlation in microbiota composition was adjusted by incorporating and regularizing the top-correlated taxa as longitudinal covariate, pre-selected by Bray-Curtis distance and elastic net regression. We designed a simulation pipeline to generate true biomarkers for disease onset and the pseudo biomarkers caused by compositionality. We demonstrated that JMR effectively controlled the false discovery and pseudo biomarkers in a simulation study generating temporal high-dimensional metagenomic counts with random intercept or slope. Application of the competing methods in the simulated data and the TEDDY cohort showed that JMR outperformed the other methods and identified important taxa in infants’ fecal samples with dynamics preceding host disease status. CONCLUSION: Our method JMR is a robust framework that models taxon-specific trajectory and host disease status for matched participants without transformation of relative abundance, improving the power of detecting disease-associated microbial features in certain scenarios. JMR is available in R package mtradeR at https://github.com/qianli10000/mtradeR. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-022-08890-1.
format Online
Article
Text
id pubmed-9484160
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-94841602022-09-20 A robust and transformation-free joint model with matching and regularization for metagenomic trajectory and disease onset Li, Qian Vehik, Kendra Li, Cai Triplett, Eric Roesch, Luiz Hu, Yi-Juan Krischer, Jeffrey BMC Genomics Research BACKGROUND: To identify operational taxonomy units (OTUs) signaling disease onset in an observational study, a powerful strategy was selecting participants by matched sets and profiling temporal metagenomes, followed by trajectory analysis. Existing trajectory analyses modeled individual OTU or microbial community without adjusting for the within-community correlation and matched-set-specific latent factors. RESULTS: We proposed a joint model with matching and regularization (JMR) to detect OTU-specific trajectory predictive of host disease status. The between- and within-matched-sets heterogeneity in OTU relative abundance and disease risk were modeled by nested random effects. The inherent negative correlation in microbiota composition was adjusted by incorporating and regularizing the top-correlated taxa as longitudinal covariate, pre-selected by Bray-Curtis distance and elastic net regression. We designed a simulation pipeline to generate true biomarkers for disease onset and the pseudo biomarkers caused by compositionality. We demonstrated that JMR effectively controlled the false discovery and pseudo biomarkers in a simulation study generating temporal high-dimensional metagenomic counts with random intercept or slope. Application of the competing methods in the simulated data and the TEDDY cohort showed that JMR outperformed the other methods and identified important taxa in infants’ fecal samples with dynamics preceding host disease status. 
CONCLUSION: Our method JMR is a robust framework that models taxon-specific trajectory and host disease status for matched participants without transformation of relative abundance, improving the power of detecting disease-associated microbial features in certain scenarios. JMR is available in R package mtradeR at https://github.com/qianli10000/mtradeR. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-022-08890-1. BioMed Central 2022-09-19 /pmc/articles/PMC9484160/ /pubmed/36123651 http://dx.doi.org/10.1186/s12864-022-08890-1 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Li, Qian
Vehik, Kendra
Li, Cai
Triplett, Eric
Roesch, Luiz
Hu, Yi-Juan
Krischer, Jeffrey
A robust and transformation-free joint model with matching and regularization for metagenomic trajectory and disease onset
title A robust and transformation-free joint model with matching and regularization for metagenomic trajectory and disease onset
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9484160/
https://www.ncbi.nlm.nih.gov/pubmed/36123651
http://dx.doi.org/10.1186/s12864-022-08890-1