Constraining PERMANOVA and LDM to within-set comparisons by projection improves the efficiency of analyses of matched sets of microbiome data
BACKGROUND: Matched-set data arise frequently in microbiome studies. For example, we may collect pre- and post-treatment samples from a set of individuals, or use important confounding variables to match data from case participants to one or more control participants. Thus, there is a need for stati...
Main authors: | Zhu, Zhengyi; Satten, Glen A.; Mitchell, Caroline; Hu, Yi-Juan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | Methodology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8191060/ https://www.ncbi.nlm.nih.gov/pubmed/34108046 http://dx.doi.org/10.1186/s40168-021-01034-9 |
_version_ | 1783705802573348864 |
---|---|
author | Zhu, Zhengyi; Satten, Glen A.; Mitchell, Caroline; Hu, Yi-Juan |
collection | PubMed |
description | BACKGROUND: Matched-set data arise frequently in microbiome studies. For example, we may collect pre- and post-treatment samples from a set of individuals, or use important confounding variables to match data from case participants to one or more control participants. Thus, there is a need for statistical methods for data comprised of matched sets, to test hypotheses against traits of interest (e.g., clinical outcomes or environmental factors) at the community level and/or the operational taxonomic unit (OTU) level. Optimally, these methods should accommodate complex data such as those with unequal sample sizes across sets, confounders varying within sets, and continuous traits of interest. METHODS: PERMANOVA is a commonly used distance-based method for testing hypotheses at the community level. We have also developed the linear decomposition model (LDM) that unifies the community-level and OTU-level tests into one framework. Here we present a new strategy that can be used with both PERMANOVA and the LDM for analyzing matched-set data. We propose to include an indicator variable for each set as covariates, so as to constrain comparisons between samples within a set, and also permute traits within each set, which can account for exchangeable sample correlations. The flexible nature of PERMANOVA and the LDM allows discrete or continuous traits or interactions to be tested, within-set confounders to be adjusted, and unbalanced data to be fully exploited. RESULTS: Our simulations indicate that our proposed strategy outperformed alternative strategies, including the commonly used one that utilizes restricted permutation only, in a wide range of scenarios. Using simulation, we also explored optimal designs for matched-set studies. The flexibility of PERMANOVA and the LDM for a variety of matched-set microbiome data is illustrated by the analysis of data from two real studies. CONCLUSIONS: Including set indicator variables and permuting within sets when analyzing matched-set data with PERMANOVA or the LDM is a strategy that performs well and is capable of handling the complex data structures that frequently occur in microbiome studies. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at (10.1186/s40168-021-01034-9). |
format | Online Article Text |
id | pubmed-8191060 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8191060 2021-06-10 Microbiome Methodology BioMed Central 2021-06-09 /pmc/articles/PMC8191060/ /pubmed/34108046 http://dx.doi.org/10.1186/s40168-021-01034-9 Text en © The Author(s) 2021. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Constraining PERMANOVA and LDM to within-set comparisons by projection improves the efficiency of analyses of matched sets of microbiome data |
topic | Methodology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8191060/ https://www.ncbi.nlm.nih.gov/pubmed/34108046 http://dx.doi.org/10.1186/s40168-021-01034-9 |
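The strategy summarized in the abstract (set indicators as covariates, plus permutation of the trait within each set) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`gower_center`, `set_hat_matrix`, `pseudo_f`, `permute_within_sets`, `matched_permanova`), the single-trait pseudo-F form, the Euclidean distance, and the simulated pre/post data are all assumptions made for illustration, and no within-set confounders are handled.

```python
import numpy as np

def gower_center(dist):
    """Gower-center the squared distances: G = -0.5 * J D^2 J."""
    n = dist.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    return -0.5 * J @ (dist ** 2) @ J

def set_hat_matrix(set_ids):
    """Hat matrix of the set-indicator covariates (one column per set)."""
    Z = (set_ids[:, None] == np.unique(set_ids)[None, :]).astype(float)
    return Z @ np.linalg.pinv(Z)

def pseudo_f(G, H_set, x):
    """Pseudo-F for trait x after projecting out the set indicators."""
    n = len(x)
    n_sets = int(round(np.trace(H_set)))
    x_within = x - H_set @ x              # trait centered within each set
    H_x = np.outer(x_within, x_within) / (x_within @ x_within)
    R = np.eye(n) - H_set - H_x           # residual projection
    explained = np.trace(H_x @ G @ H_x)
    residual = np.trace(R @ G @ R) / (n - n_sets - 1)
    return explained / residual

def permute_within_sets(x, set_ids, rng):
    """Shuffle trait values only among samples that share a set."""
    out = x.copy()
    for s in np.unique(set_ids):
        idx = np.flatnonzero(set_ids == s)
        out[idx] = x[rng.permutation(idx)]
    return out

def matched_permanova(dist, x, set_ids, n_perm=999, seed=0):
    """Return (pseudo-F, permutation p-value) for one trait (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    set_ids = np.asarray(set_ids)
    G = gower_center(np.asarray(dist, dtype=float))
    H_set = set_hat_matrix(set_ids)
    f_obs = pseudo_f(G, H_set, x)
    hits = sum(
        pseudo_f(G, H_set, permute_within_sets(x, set_ids, rng)) >= f_obs
        for _ in range(n_perm)
    )
    return f_obs, (hits + 1) / (n_perm + 1)

# Simulated example (assumed data): 10 individuals, pre (0) and post (1) samples.
rng = np.random.default_rng(1)
set_ids = np.repeat(np.arange(10), 2)
x = np.tile([0.0, 1.0], 10)
profiles = (np.repeat(rng.normal(size=(10, 5)), 2, axis=0)  # per-individual baseline
            + 3.0 * x[:, None]                               # treatment shift
            + rng.normal(scale=0.5, size=(20, 5)))           # sample noise
dist = np.linalg.norm(profiles[:, None, :] - profiles[None, :, :], axis=2)
f_obs, p_value = matched_permanova(dist, x, set_ids)
```

Because the set indicators are projected out before the trait is tested, between-individual variation does not enter the residual term, which is the efficiency gain the title refers to; permuting only within sets keeps each individual's samples together under the null.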