A robust and transformation-free joint model with matching and regularization for metagenomic trajectory and disease onset
BACKGROUND: To identify operational taxonomic units (OTUs) that signal disease onset in an observational study, a powerful strategy is to select participants in matched sets and profile their metagenomes over time, followed by trajectory analysis. Existing trajectory analyses model individual OTUs or micr...
Main Authors: | Li, Qian; Vehik, Kendra; Li, Cai; Triplett, Eric; Roesch, Luiz; Hu, Yi-Juan; Krischer, Jeffrey |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | Research |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9484160/ https://www.ncbi.nlm.nih.gov/pubmed/36123651 http://dx.doi.org/10.1186/s12864-022-08890-1 |
_version_ | 1784791825916100608 |
---|---|
author | Li, Qian; Vehik, Kendra; Li, Cai; Triplett, Eric; Roesch, Luiz; Hu, Yi-Juan; Krischer, Jeffrey |
author_sort | Li, Qian |
collection | PubMed |
description | BACKGROUND: To identify operational taxonomic units (OTUs) that signal disease onset in an observational study, a powerful strategy is to select participants in matched sets and profile their metagenomes over time, followed by trajectory analysis. Existing trajectory analyses model individual OTUs or the microbial community without adjusting for within-community correlation or matched-set-specific latent factors. RESULTS: We proposed a joint model with matching and regularization (JMR) to detect OTU-specific trajectories predictive of host disease status. The between- and within-matched-set heterogeneity in OTU relative abundance and disease risk was modeled by nested random effects. The inherent negative correlation in microbiota composition was adjusted for by incorporating and regularizing the top-correlated taxa as longitudinal covariates, pre-selected by Bray-Curtis distance and elastic net regression. We designed a simulation pipeline to generate true biomarkers for disease onset and pseudo biomarkers caused by compositionality. In a simulation study generating temporal high-dimensional metagenomic counts with random intercept or slope, JMR effectively controlled false discoveries and pseudo biomarkers. Applying the competing methods to the simulated data and the TEDDY cohort showed that JMR outperformed the other methods and identified important taxa in infants’ fecal samples with dynamics preceding host disease status. CONCLUSION: JMR is a robust framework that models taxon-specific trajectories and host disease status for matched participants without transforming relative abundance, improving the power to detect disease-associated microbial features in certain scenarios. JMR is available in the R package mtradeR at https://github.com/qianli10000/mtradeR. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-022-08890-1. |
format | Online Article Text |
id | pubmed-9484160 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-9484160 2022-09-20 A robust and transformation-free joint model with matching and regularization for metagenomic trajectory and disease onset. Li, Qian; Vehik, Kendra; Li, Cai; Triplett, Eric; Roesch, Luiz; Hu, Yi-Juan; Krischer, Jeffrey. BMC Genomics. Research. BioMed Central 2022-09-19 /pmc/articles/PMC9484160/ /pubmed/36123651 http://dx.doi.org/10.1186/s12864-022-08890-1 Text en © The Author(s) 2022. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | A robust and transformation-free joint model with matching and regularization for metagenomic trajectory and disease onset |
title_sort | robust and transformation-free joint model with matching and regularization for metagenomic trajectory and disease onset |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9484160/ https://www.ncbi.nlm.nih.gov/pubmed/36123651 http://dx.doi.org/10.1186/s12864-022-08890-1 |
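
The description above mentions pre-selecting top-correlated taxa by Bray-Curtis distance. As a rough illustration of that distance only, the sketch below ranks taxa by Bray-Curtis dissimilarity to a target taxon. It is not the mtradeR implementation: the function names, the taxon labels, the abundance values, and the choice to rank by smallest dissimilarity are all assumptions made for this example.

```python
from typing import Dict, List, Sequence


def bray_curtis(a: Sequence[float], b: Sequence[float]) -> float:
    """Bray-Curtis dissimilarity between two non-negative abundance vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0  # two all-zero profiles are treated as identical
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def nearest_taxa(target: str, profiles: Dict[str, List[float]], k: int) -> List[str]:
    """Return the k taxa whose abundance profiles are closest to `target`.

    Hypothetical helper: ranking by smallest dissimilarity is an assumption
    of this sketch, not a documented step of the published method.
    """
    distances = {
        name: bray_curtis(profiles[target], vec)
        for name, vec in profiles.items()
        if name != target
    }
    return sorted(distances, key=distances.get)[:k]


# Made-up relative abundances of three taxa across four samples.
profiles = {
    "taxon_A": [0.50, 0.40, 0.30, 0.20],
    "taxon_B": [0.45, 0.40, 0.35, 0.20],
    "taxon_C": [0.05, 0.20, 0.35, 0.60],
}
print(nearest_taxa("taxon_A", profiles, k=1))
```

In this toy input, taxon_B tracks taxon_A closely across samples, so it is the one returned. According to the description, the published method follows pre-selection with elastic net regression and a joint model with nested random effects, none of which is shown here.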