
Beta-diversity distance matrices for microbiome sample size and power calculations — How to obtain good estimates


Bibliographic Details
Format: Online Article Text
Language: English
Published: Research Network of Computational and Structural Biotechnology 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9133771/
https://www.ncbi.nlm.nih.gov/pubmed/35664226
http://dx.doi.org/10.1016/j.csbj.2022.04.032
collection PubMed
description In microbiome studies, researchers often wish to compare the taxa count distributions between groups of samples. Commonly used methods of analysis are built on examining distance matrices, where distances describe the beta-diversity between samples. Analyses then compare the distribution of distances within groups to the distribution of distances between groups. However, when performing a priori sample size or power calculations for such study designs, appropriate within- and between-group distance distributions can be challenging to obtain. When available, pilot study data or data from prior studies of similar design should provide realistic distance estimates. When these are not available, distances can be extracted from available studies where one can assume similar beta-diversity. Alternatively, distances can be generated by simulation methods. Here, we describe and illustrate these three strategies for obtaining realistic distance matrices. For simulation methods, we illustrate the procedures required starting from existing benchmark data, as well as how to simulate directly from population assumptions. Using data from the American Gut project, we provide tables of observed distances for use by researchers planning their own studies, as well as R code for generating similar matrices in other datasets. Furthermore, for simulated data, we compare methods, provide R code, and demonstrate how challenging it is to obtain realistic distance distributions without any benchmark data. This code and the illustrative distance tables are provided by the IMPACTT Consortium as a resource to the microbiome research community.
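The comparison of within-group and between-group distances described above can be illustrated with a minimal sketch. It is not the authors' R code: it assumes Bray-Curtis dissimilarity as the beta-diversity measure and a Dirichlet-multinomial model for simulating counts "from population assumptions", neither of which is named in this record. The function names, group labels, concentration parameters, and sequencing depth are all hypothetical values chosen for illustration.

```python
import random
from itertools import combinations
from statistics import mean


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative count vectors."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(u) + sum(v)
    return numerator / denominator if denominator else 0.0


def simulate_sample(rng, alpha, depth):
    """Simulate one sample: Dirichlet proportions, then multinomial counts.

    `alpha` (Dirichlet concentrations) and `depth` (reads per sample)
    are hypothetical population assumptions, not values from the article.
    """
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    proportions = [g / total for g in gammas]
    counts = [0] * len(alpha)
    for taxon in rng.choices(range(len(alpha)), weights=proportions, k=depth):
        counts[taxon] += 1
    return counts


def within_between(samples, groups):
    """Split all pairwise distances into within-group and between-group lists."""
    within, between = [], []
    for i, j in combinations(range(len(samples)), 2):
        d = bray_curtis(samples[i], samples[j])
        (within if groups[i] == groups[j] else between).append(d)
    return within, between


rng = random.Random(42)
alpha_a = [8.0, 4.0, 2.0, 1.0, 1.0]  # hypothetical group A composition
alpha_b = [1.0, 1.0, 2.0, 4.0, 8.0]  # hypothetical group B composition
samples = [simulate_sample(rng, alpha_a, 1000) for _ in range(10)]
samples += [simulate_sample(rng, alpha_b, 1000) for _ in range(10)]
groups = ["A"] * 10 + ["B"] * 10

within, between = within_between(samples, groups)
print(f"mean within-group distance:  {mean(within):.3f}")
print(f"mean between-group distance: {mean(between):.3f}")
```

The two lists returned by `within_between` are the kind of input a distance-based power calculation would need. With real pilot or benchmark data, the simulated `samples` would be replaced by observed taxa count vectors.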
format Online
Article
Text
id pubmed-9133771
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Research Network of Computational and Structural Biotechnology
record_format MEDLINE/PubMed
spelling pubmed-9133771 2022-06-04 Comput Struct Biotechnol J Method Article
Research Network of Computational and Structural Biotechnology 2022-04-27 /pmc/articles/PMC9133771/ /pubmed/35664226 http://dx.doi.org/10.1016/j.csbj.2022.04.032 Text en © 2022 The Authors. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
topic Method Article