
Shrinkage improves estimation of microbial associations under different normalization methods

Estimation of statistical associations in microbial genomic survey count data is fundamental to microbiome research. Experimental limitations, including count compositionality, low sample sizes and technical variability, obstruct standard application of association measures and require data normaliz...

Bibliographic Details
Main Authors: Badri, Michelle; Kurtz, Zachary D; Bonneau, Richard; Müller, Christian L
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7745771/
https://www.ncbi.nlm.nih.gov/pubmed/33575644
http://dx.doi.org/10.1093/nargab/lqaa100
collection PubMed
description Estimation of statistical associations in microbial genomic survey count data is fundamental to microbiome research. Experimental limitations, including count compositionality, low sample sizes and technical variability, obstruct standard application of association measures and require data normalization prior to statistical estimation. Here, we investigate the interplay between data normalization, microbial association estimation and available sample size by leveraging the large-scale American Gut Project (AGP) survey data. We analyze the statistical properties of two prominent linear association estimators, correlation and proportionality, under different sample scenarios and data normalization schemes, including RNA-seq analysis workflows and log-ratio transformations. We show that shrinkage estimation, a standard statistical regularization technique, can universally improve the quality of taxon–taxon association estimates for microbiome data. We find that large-scale association patterns in the AGP data can be grouped into five normalization-dependent classes. Using microbial association network construction and clustering as downstream data analysis examples, we show that variance-stabilizing and log-ratio approaches enable the most taxonomically and structurally coherent estimates. Taken together, the findings from our reproducible analysis workflow have important implications for microbiome studies in multiple stages of analysis, particularly when only small sample sizes are available.
format Online
Article
Text
id pubmed-7745771
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7745771 2021-02-10 NAR Genom Bioinform Methods Article Oxford University Press 2020-12-17 Text en
© The Author(s) 2020. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Methods Article