Assessment and Selection of Competing Models for Zero-Inflated Microbiome Data

Typical data in a microbiome study consist of operational taxonomic unit (OTU) counts with excess zeros, a feature that investigators often ignore. In this paper, we compare the performance of competing methods for modeling zero-inflated data through...

Bibliographic Details
Main Authors: Xu, Lizhen; Paterson, Andrew D.; Turpin, Williams; Xu, Wei
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects: Research Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4493133/
https://www.ncbi.nlm.nih.gov/pubmed/26148172
http://dx.doi.org/10.1371/journal.pone.0129606
collection PubMed
description Typical data in a microbiome study consist of operational taxonomic unit (OTU) counts with excess zeros, a feature that investigators often ignore. In this paper, we compare the performance of competing methods for modeling zero-inflated data through extensive simulations and an application to a microbiome study. These methods include standard parametric and non-parametric models, hurdle models, and zero-inflated models. We examine varying degrees of zero inflation, with or without dispersion in the count component, as well as different magnitudes and directions of the covariate effect on the structural zeros and the count component. We focus on the assessment of type I error, power to detect the overall covariate effect, measures of model fit, and the bias and effectiveness of parameter estimation. We also evaluate how well model selection strategies based on the Akaike information criterion (AIC) or the Vuong test identify the correct model. The simulation studies show that hurdle and zero-inflated models have well-controlled type I errors, higher power, and better goodness-of-fit measures, and are more accurate and efficient in parameter estimation. In addition, hurdle models have goodness of fit and count-component parameter estimates similar to those of their corresponding zero-inflated models. However, the estimation and interpretation of the parameters for the zero component differ, and hurdle models are more stable when structural zeros are absent. We then discuss the model selection strategy for zero-inflated data and implement it in a gut microbiome study of more than 400 independent subjects.
id pubmed-4493133
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-4493133 2015-07-15 Assessment and Selection of Competing Models for Zero-Inflated Microbiome Data Xu, Lizhen Paterson, Andrew D. Turpin, Williams Xu, Wei PLoS One Research Article
Public Library of Science 2015-07-06 /pmc/articles/PMC4493133/ /pubmed/26148172 http://dx.doi.org/10.1371/journal.pone.0129606 Text en © 2015 Xu et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Assessment and Selection of Competing Models for Zero-Inflated Microbiome Data
topic Research Article