Shrinkage improves estimation of microbial associations under different normalization methods
Estimation of statistical associations in microbial genomic survey count data is fundamental to microbiome research. Experimental limitations, including count compositionality, low sample sizes and technical variability, obstruct standard application of association measures and require data normaliz...
Main authors: | Badri, Michelle; Kurtz, Zachary D; Bonneau, Richard; Müller, Christian L |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2020 |
Subjects: | Methods Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7745771/ https://www.ncbi.nlm.nih.gov/pubmed/33575644 http://dx.doi.org/10.1093/nargab/lqaa100 |
_version_ | 1783624669741449216 |
---|---|
author | Badri, Michelle; Kurtz, Zachary D; Bonneau, Richard; Müller, Christian L |
author_facet | Badri, Michelle; Kurtz, Zachary D; Bonneau, Richard; Müller, Christian L |
author_sort | Badri, Michelle |
collection | PubMed |
description | Estimation of statistical associations in microbial genomic survey count data is fundamental to microbiome research. Experimental limitations, including count compositionality, low sample sizes and technical variability, obstruct standard application of association measures and require data normalization prior to statistical estimation. Here, we investigate the interplay between data normalization, microbial association estimation and available sample size by leveraging the large-scale American Gut Project (AGP) survey data. We analyze the statistical properties of two prominent linear association estimators, correlation and proportionality, under different sample scenarios and data normalization schemes, including RNA-seq analysis workflows and log-ratio transformations. We show that shrinkage estimation, a standard statistical regularization technique, can universally improve the quality of taxon–taxon association estimates for microbiome data. We find that large-scale association patterns in the AGP data can be grouped into five normalization-dependent classes. Using microbial association network construction and clustering as downstream data analysis examples, we show that variance-stabilizing and log-ratio approaches enable the most taxonomically and structurally coherent estimates. Taken together, the findings from our reproducible analysis workflow have important implications for microbiome studies in multiple stages of analysis, particularly when only small sample sizes are available. |
format | Online Article Text |
id | pubmed-7745771 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-7745771 2021-02-10 Shrinkage improves estimation of microbial associations under different normalization methods Badri, Michelle; Kurtz, Zachary D; Bonneau, Richard; Müller, Christian L NAR Genom Bioinform Methods Article Oxford University Press 2020-12-17 /pmc/articles/PMC7745771/ /pubmed/33575644 http://dx.doi.org/10.1093/nargab/lqaa100 Text en © The Author(s) 2020. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Shrinkage improves estimation of microbial associations under different normalization methods |
topic | Methods Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7745771/ https://www.ncbi.nlm.nih.gov/pubmed/33575644 http://dx.doi.org/10.1093/nargab/lqaa100 |
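
The description in this record names two ingredients, log-ratio transformation and shrinkage estimation of taxon–taxon correlations. The following is a minimal sketch of how these two steps can fit together, not the authors' workflow. It assumes a centered log-ratio (CLR) transform with a pseudocount of 1, linear shrinkage of the sample correlation matrix toward the identity, and a hand-picked shrinkage intensity `lam`; the record does not state which shrinkage target or intensity rule the article uses. The function names `clr` and `shrink_correlation` and the synthetic Poisson counts are hypothetical and serve only as illustration.

```python
import numpy as np


def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform of a samples x taxa count matrix.

    The pseudocount (an assumption of this sketch) avoids log(0).
    """
    logx = np.log(np.asarray(counts, dtype=float) + pseudocount)
    # Subtract each sample's mean log value so that every row sums to zero.
    return logx - logx.mean(axis=1, keepdims=True)


def shrink_correlation(data, lam):
    """Linearly shrink the sample correlation matrix toward the identity.

    lam = 0 returns the sample correlation; lam = 1 returns the identity.
    A fixed lam is an assumption here; in practice it is usually estimated.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    r = np.corrcoef(data, rowvar=False)
    return (1.0 - lam) * r + lam * np.eye(r.shape[0])


# Synthetic example: 8 samples, 5 taxa (illustrative data only).
rng = np.random.default_rng(0)
counts = rng.poisson(lam=20.0, size=(8, 5))

z = clr(counts)
r_raw = np.corrcoef(z, rowvar=False)
r_shrunk = shrink_correlation(z, lam=0.3)
```

Shrinking toward the identity scales every off-diagonal correlation by `1 - lam` and leaves the diagonal at 1. With few samples this damps noisy estimates and yields a positive definite matrix even though CLR-transformed data have a singular covariance.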