
The mixed model for the analysis of a repeated‐measurement multivariate count data

Clustered overdispersed multivariate count data are challenging to model due to the presence of correlation within and between samples. Typically, the first source of correlation needs to be addressed but its quantification is of less interest. Here, we focus on the correlation between time points....


Bibliographic Details
Main authors:	Martin, Ivonne, Uh, Hae‐Won, Supali, Taniawati, Mitreva, Makedonka, Houwing‐Duistermaat, Jeanine J.
Format:	Online Article Text
Language:	English
Published:	John Wiley and Sons Inc. 2019
Subjects:	Research Articles
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6594162/ https://www.ncbi.nlm.nih.gov/pubmed/30761571 http://dx.doi.org/10.1002/sim.8101

author	Martin, Ivonne Uh, Hae‐Won Supali, Taniawati Mitreva, Makedonka Houwing‐Duistermaat, Jeanine J.
author_sort	Martin, Ivonne
collection	PubMed
description	Clustered overdispersed multivariate count data are challenging to model due to the presence of correlation within and between samples. Typically, the first source of correlation needs to be addressed but its quantification is of less interest. Here, we focus on the correlation between time points. In addition, the effects of covariates on the multivariate counts distribution need to be assessed. To fulfill these requirements, a regression model based on the Dirichlet‐multinomial distribution for association between covariates and the categorical counts is extended by using random effects to deal with the additional clustering. This model is the Dirichlet‐multinomial mixed regression model. Alternatively, a negative binomial regression mixed model can be deployed where the corresponding likelihood is conditioned on the total count. It appears that these two approaches are equivalent when the total count is fixed and independent of the random effects. We consider both subject‐specific and categorical‐specific random effects. However, the latter has a larger computational burden when the number of categories increases. Our work is motivated by microbiome data sets obtained by sequencing of the amplicon of the bacterial 16S rRNA gene. These data have a compositional structure and are typically overdispersed. The microbiome data set is from an epidemiological study carried out in a helminth‐endemic area in Indonesia. The conclusions are as follows: time has no statistically significant effect on microbiome composition, the correlation between subjects is statistically significant, and treatment has a significant effect on the microbiome composition only in infected subjects who remained infected.
format	Online Article Text
id	pubmed-6594162
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	John Wiley and Sons Inc.
record_format	MEDLINE/PubMed
spelling	pubmed-6594162 2019-07-10 The mixed model for the analysis of a repeated‐measurement multivariate count data Martin, Ivonne Uh, Hae‐Won Supali, Taniawati Mitreva, Makedonka Houwing‐Duistermaat, Jeanine J. Stat Med Research Articles John Wiley and Sons Inc. 2019-02-13 2019-05-30 /pmc/articles/PMC6594162/ /pubmed/30761571 http://dx.doi.org/10.1002/sim.8101 Text en © 2019 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd. This is an open access article under the terms of the http://creativecommons.org/licenses/by-nc/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
title	The mixed model for the analysis of a repeated‐measurement multivariate count data
topic	Research Articles
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6594162/ https://www.ncbi.nlm.nih.gov/pubmed/30761571 http://dx.doi.org/10.1002/sim.8101
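
The abstract states that the Dirichlet‐multinomial model and a negative binomial model conditioned on the total count are equivalent when the total count is fixed. The sketch below checks that identity numerically for a single sample without random effects or covariates. It is an illustration only, not the authors' implementation: the function names, the example counts, and the negative binomial parameterization (independent counts with size alpha_j and a common success probability p) are assumptions, since the record does not give the paper's parameterization.

```python
import math

def dm_logpmf(counts, alpha):
    """Log-pmf of the Dirichlet-multinomial for one vector of category counts."""
    n = sum(counts)
    a0 = sum(alpha)
    out = math.lgamma(n + 1) + math.lgamma(a0) - math.lgamma(n + a0)
    for x, a in zip(counts, alpha):
        out += math.lgamma(x + a) - math.lgamma(a) - math.lgamma(x + 1)
    return out

def nb_logpmf(x, r, p):
    """Log-pmf of a negative binomial count x with size r and success probability p."""
    return (math.lgamma(x + r) - math.lgamma(r) - math.lgamma(x + 1)
            + r * math.log(p) + x * math.log(1.0 - p))

def nb_conditional_logpmf(counts, alpha, p):
    """Log-probability of the counts given their total, for independent
    NB(alpha_j, p) counts. With a common p (an assumption made here), the
    total is NB(sum(alpha), p)."""
    joint = sum(nb_logpmf(x, a, p) for x, a in zip(counts, alpha))
    total = nb_logpmf(sum(counts), sum(alpha), p)
    return joint - total

# Hypothetical counts for three categories and hypothetical parameters.
counts = [3, 1, 2]
alpha = [0.5, 1.2, 2.0]

dm_value = dm_logpmf(counts, alpha)
nb_value = nb_conditional_logpmf(counts, alpha, p=0.3)
print(abs(dm_value - nb_value) < 1e-9)
```

The conditional probability does not depend on p, because the terms in p cancel between the joint and the total. In a regression setting, alpha_j would be tied to covariates (for example through a log link) and random effects would enter the linear predictor; both are omitted here.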
