Constraining PERMANOVA and LDM to within-set comparisons by projection improves the efficiency of analyses of matched sets of microbiome data

BACKGROUND: Matched-set data arise frequently in microbiome studies. For example, we may collect pre- and post-treatment samples from a set of individuals, or use important confounding variables to match data from case participants to one or more control participants. Thus, there is a need for stati...

Bibliographic Details
Main Authors: Zhu, Zhengyi; Satten, Glen A.; Mitchell, Caroline; Hu, Yi-Juan
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects: Methodology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8191060/
https://www.ncbi.nlm.nih.gov/pubmed/34108046
http://dx.doi.org/10.1186/s40168-021-01034-9
_version_ 1783705802573348864
author Zhu, Zhengyi
Satten, Glen A.
Mitchell, Caroline
Hu, Yi-Juan
author_facet Zhu, Zhengyi
Satten, Glen A.
Mitchell, Caroline
Hu, Yi-Juan
author_sort Zhu, Zhengyi
collection PubMed
description BACKGROUND: Matched-set data arise frequently in microbiome studies. For example, we may collect pre- and post-treatment samples from a set of individuals, or use important confounding variables to match data from case participants to one or more control participants. Thus, there is a need for statistical methods for data comprised of matched sets, to test hypotheses against traits of interest (e.g., clinical outcomes or environmental factors) at the community level and/or the operational taxonomic unit (OTU) level. Optimally, these methods should accommodate complex data such as those with unequal sample sizes across sets, confounders varying within sets, and continuous traits of interest. METHODS: PERMANOVA is a commonly used distance-based method for testing hypotheses at the community level. We have also developed the linear decomposition model (LDM) that unifies the community-level and OTU-level tests into one framework. Here we present a new strategy that can be used with both PERMANOVA and the LDM for analyzing matched-set data. We propose to include an indicator variable for each set as covariates, so as to constrain comparisons between samples within a set, and also permute traits within each set, which can account for exchangeable sample correlations. The flexible nature of PERMANOVA and the LDM allows discrete or continuous traits or interactions to be tested, within-set confounders to be adjusted, and unbalanced data to be fully exploited. RESULTS: Our simulations indicate that our proposed strategy outperformed alternative strategies, including the commonly used one that utilizes restricted permutation only, in a wide range of scenarios. Using simulation, we also explored optimal designs for matched-set studies. The flexibility of PERMANOVA and the LDM for a variety of matched-set microbiome data is illustrated by the analysis of data from two real studies. 
CONCLUSIONS: Including set indicator variables and permuting within sets when analyzing matched-set data with PERMANOVA or the LDM is a strategy that performs well and is capable of handling the complex data structures that frequently occur in microbiome studies. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at (10.1186/s40168-021-01034-9).
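The strategy described in METHODS (set indicators as covariates, plus permutation of the trait within each set) can be illustrated with a minimal NumPy sketch. This is not the authors' software: the function names `gower_center`, `permute_within`, and `within_set_permanova` are hypothetical, the statistic omits the degrees-of-freedom constants (they are the same for every permutation, so the p-value is unaffected), and only a single trait with no additional confounders is handled.

```python
import numpy as np

def gower_center(d):
    """Gower-center a distance matrix: G = -1/2 * J D^2 J."""
    n = d.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n
    return -0.5 * j @ (d ** 2) @ j

def permute_within(x, codes, rng):
    """Shuffle the entries of x only among samples that share a set code."""
    out = x.copy()
    for c in np.unique(codes):
        idx = np.flatnonzero(codes == c)
        out[idx] = x[rng.permutation(idx)]
    return out

def within_set_permanova(d, trait, set_ids, n_perm=999, seed=0):
    """Pseudo-F style statistic with set indicators projected out.

    Returns (statistic, permutation p-value). Assumption: one trait,
    no extra covariates, df constants dropped.
    """
    rng = np.random.default_rng(seed)
    d = np.asarray(d, dtype=float)
    trait = np.asarray(trait, dtype=float)
    n = d.shape[0]
    g = gower_center(d)

    # One indicator column per set; projecting these out removes all
    # between-set variation, so only within-set contrasts remain.
    labels, codes = np.unique(np.asarray(set_ids), return_inverse=True)
    z = np.eye(len(labels))[codes]
    resid = np.eye(n) - z @ np.linalg.pinv(z)
    total_within = np.trace(resid @ g)

    def stat(x):
        xr = resid @ x                      # trait residualized on sets
        ss = xr @ xr
        if ss < 1e-12:                      # trait constant within every set
            return 0.0
        explained = xr @ g @ xr / ss        # tr(H_x G)
        return explained / (total_within - explained)

    observed = stat(trait)
    perms = np.array([stat(permute_within(trait, codes, rng))
                      for _ in range(n_perm)])
    p_value = (1 + np.sum(perms >= observed)) / (1 + n_perm)
    return observed, p_value
```

For example, with pre- and post-treatment samples from the same individuals, `set_ids` would be the individual identifier and `trait` the 0/1 treatment indicator. Because the set indicators span the intercept, no separate centering of the trait is needed.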
format Online
Article
Text
id pubmed-8191060
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8191060 2021-06-10 Constraining PERMANOVA and LDM to within-set comparisons by projection improves the efficiency of analyses of matched sets of microbiome data Zhu, Zhengyi; Satten, Glen A.; Mitchell, Caroline; Hu, Yi-Juan Microbiome Methodology BACKGROUND: Matched-set data arise frequently in microbiome studies. For example, we may collect pre- and post-treatment samples from a set of individuals, or use important confounding variables to match data from case participants to one or more control participants. Thus, there is a need for statistical methods for data comprised of matched sets, to test hypotheses against traits of interest (e.g., clinical outcomes or environmental factors) at the community level and/or the operational taxonomic unit (OTU) level. Optimally, these methods should accommodate complex data such as those with unequal sample sizes across sets, confounders varying within sets, and continuous traits of interest. METHODS: PERMANOVA is a commonly used distance-based method for testing hypotheses at the community level. We have also developed the linear decomposition model (LDM) that unifies the community-level and OTU-level tests into one framework. Here we present a new strategy that can be used with both PERMANOVA and the LDM for analyzing matched-set data. We propose to include an indicator variable for each set as covariates, so as to constrain comparisons between samples within a set, and also permute traits within each set, which can account for exchangeable sample correlations. The flexible nature of PERMANOVA and the LDM allows discrete or continuous traits or interactions to be tested, within-set confounders to be adjusted, and unbalanced data to be fully exploited. RESULTS: Our simulations indicate that our proposed strategy outperformed alternative strategies, including the commonly used one that utilizes restricted permutation only, in a wide range of scenarios.
Using simulation, we also explored optimal designs for matched-set studies. The flexibility of PERMANOVA and the LDM for a variety of matched-set microbiome data is illustrated by the analysis of data from two real studies. CONCLUSIONS: Including set indicator variables and permuting within sets when analyzing matched-set data with PERMANOVA or the LDM is a strategy that performs well and is capable of handling the complex data structures that frequently occur in microbiome studies. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at (10.1186/s40168-021-01034-9). BioMed Central 2021-06-09 /pmc/articles/PMC8191060/ /pubmed/34108046 http://dx.doi.org/10.1186/s40168-021-01034-9 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Methodology
Zhu, Zhengyi
Satten, Glen A.
Mitchell, Caroline
Hu, Yi-Juan
Constraining PERMANOVA and LDM to within-set comparisons by projection improves the efficiency of analyses of matched sets of microbiome data
title Constraining PERMANOVA and LDM to within-set comparisons by projection improves the efficiency of analyses of matched sets of microbiome data
topic Methodology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8191060/
https://www.ncbi.nlm.nih.gov/pubmed/34108046
http://dx.doi.org/10.1186/s40168-021-01034-9