Negative binomial mixed models for analyzing microbiome count data

BACKGROUND: Recent advances in next-generation sequencing (NGS) technology enable researchers to collect a large volume of metagenomic sequencing data. These data provide valuable resources for investigating interactions between the microbiome and host environmental/clinical factors. In addition to...

Bibliographic Details
Main authors: Zhang, Xinyan; Mallick, Himel; Tang, Zaixiang; Zhang, Lei; Cui, Xiangqin; Benson, Andrew K.; Yi, Nengjun
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5209949/
https://www.ncbi.nlm.nih.gov/pubmed/28049409
http://dx.doi.org/10.1186/s12859-016-1441-7
author Zhang, Xinyan
Mallick, Himel
Tang, Zaixiang
Zhang, Lei
Cui, Xiangqin
Benson, Andrew K.
Yi, Nengjun
collection PubMed
description BACKGROUND: Recent advances in next-generation sequencing (NGS) technology enable researchers to collect a large volume of metagenomic sequencing data. These data provide valuable resources for investigating interactions between the microbiome and host environmental/clinical factors. In addition to the well-known properties of microbiome count measurements (for example, varying total sequence reads across samples, over-dispersion, and zero-inflation), microbiome studies usually collect samples with hierarchical structures, which introduce correlation among the samples and thus further complicate the analysis and interpretation of microbiome count data. RESULTS: In this article, we propose negative binomial mixed models (NBMMs) for detecting the association between the microbiome and host environmental/clinical factors for correlated microbiome count data. Although they do not address zero-inflation, the proposed mixed-effects models account for correlation among the samples by incorporating random effects into the commonly used fixed-effects negative binomial model, and can efficiently handle over-dispersion and varying total reads. We have developed a flexible and efficient IWLS (Iterative Weighted Least Squares) algorithm to fit the proposed NBMMs by taking advantage of the standard procedure for fitting linear mixed models. CONCLUSIONS: We evaluate and demonstrate the proposed method via extensive simulation studies and an application to mouse gut microbiome data. The results show that the proposed method has desirable properties and outperforms the previously used methods in terms of both empirical power and Type I error. The method has been incorporated into the freely available R package BhGLM (http://www.ssg.uab.edu/bhglm/ and http://github.com/abbyyan3/BhGLM), providing a useful tool for analyzing microbiome data.
format Online
Article
Text
id pubmed-5209949
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5209949 2017-01-04 Negative binomial mixed models for analyzing microbiome count data Zhang, Xinyan Mallick, Himel Tang, Zaixiang Zhang, Lei Cui, Xiangqin Benson, Andrew K. Yi, Nengjun BMC Bioinformatics Methodology Article BioMed Central 2017-01-03 /pmc/articles/PMC5209949/ /pubmed/28049409 http://dx.doi.org/10.1186/s12859-016-1441-7 Text en
© The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Negative binomial mixed models for analyzing microbiome count data
topic Methodology Article
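
Illustrative note: the abstract in this record describes fitting negative binomial models with an IWLS algorithm while accounting for varying total reads. The sketch below is not the authors' BhGLM implementation. It is a simplified, fixed-effects-only illustration that assumes a log link, a known dispersion parameter `theta`, and the log of total reads as an offset; the random effects that define the NBMM are omitted. The function name `nb_iwls` and all simulated values are hypothetical.

```python
import numpy as np

def nb_iwls(y, X, offset, theta, n_iter=50, tol=1e-8):
    """IWLS for a negative binomial GLM with log link and known theta.

    Assumed model (illustration only): Var(y) = mu + mu^2 / theta and
    log(mu) = offset + X @ beta. No random effects are included.
    """
    beta = np.zeros(X.shape[1])
    eta = np.log(y + 0.5)  # crude starting value for the linear predictor
    for _ in range(n_iter):
        mu = np.exp(eta)
        # Working weights for the log link: (dmu/deta)^2 / Var(y)
        w = mu / (1.0 + mu / theta)
        # Working response with the offset removed
        z = (eta - offset) + (y - mu) / mu
        XtW = X.T * w
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        eta = offset + X @ beta_new
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Simulated example (all values are made up for illustration).
rng = np.random.default_rng(0)
n = 2000
theta = 2.0
beta_true = np.array([-4.0, 0.5])
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
total_reads = rng.integers(5000, 20000, size=n)
offset = np.log(total_reads)
mu_true = np.exp(offset + X @ beta_true)
# Negative binomial counts drawn as a gamma-Poisson mixture
y = rng.poisson(rng.gamma(shape=theta, scale=mu_true / theta))

beta_hat = nb_iwls(y, X, offset, theta)
print(beta_hat)
```

Using the log of total reads as an offset puts the coefficients on the scale of relative abundance rather than raw counts. A mixed-model version would, per the abstract, replace the weighted least squares step with a linear mixed model fit on the same working response and weights.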