
Negative binomial factor regression with application to microbiome data analysis

The human microbiome provides essential physiological functions and helps maintain host homeostasis via the formation of intricate ecological host‐microbiome relationships. While it is well established that the lifestyle of the host, dietary preferences, demographic background, and health status can...

Bibliographic Details
Main authors: Mishra, Aditya K., Müller, Christian L.
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2022
Subjects: Research Articles
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9325477/
https://www.ncbi.nlm.nih.gov/pubmed/35466418
http://dx.doi.org/10.1002/sim.9384
author Mishra, Aditya K.
Müller, Christian L.
collection PubMed
description The human microbiome provides essential physiological functions and helps maintain host homeostasis via the formation of intricate ecological host‐microbiome relationships. While it is well established that the lifestyle of the host, dietary preferences, demographic background, and health status can influence microbial community composition and dynamics, robust generalizable associations between specific host‐associated factors and specific microbial taxa have remained largely elusive. Here, we propose factor regression models that allow the estimation of structured parsimonious associations between host‐related features and amplicon‐derived microbial taxa. To account for the overdispersed nature of the amplicon sequencing count data, we propose negative binomial reduced rank regression (NB‐RRR) and negative binomial co‐sparse factor regression (NB‐FAR). While NB‐RRR encodes the underlying dependency among the microbial abundances as outcomes and the host‐associated features as predictors through a rank‐constrained coefficient matrix, NB‐FAR uses a sparse singular value decomposition of the coefficient matrix. The latter approach avoids the notoriously difficult joint parameter estimation by extracting sparse unit‐rank components of the coefficient matrix sequentially, effectively delivering interpretable bi‐clusters of taxa and host‐associated factors. To solve the nonconvex optimization problems associated with these factor regression models, we present a novel iterative block‐wise majorization procedure. Extensive simulation studies and an application to the microbial abundance data from the American Gut Project (AGP) demonstrate the efficacy of the proposed procedure. In the AGP data, we identify several factors that strongly link dietary habits and host life style to specific microbial families.
format Online
Article
Text
id pubmed-9325477
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-9325477 2022-07-30 Stat Med Research Articles John Wiley and Sons Inc. 2022-04-24 2022-07-10 /pmc/articles/PMC9325477/ /pubmed/35466418 http://dx.doi.org/10.1002/sim.9384 Text en © 2022 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. This is an open access article under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
title Negative binomial factor regression with application to microbiome data analysis
topic Research Articles