Hierarchicell: an R-package for estimating power for tests of differential expression with single-cell data

BACKGROUND: Study design is a critical aspect of any experiment, and sample size calculations for statistical power that are consistent with that study design are central to robust and reproducible results. However, the existing power calculators for tests of differential expression in single-cell R...

Bibliographic Details
Main authors: Zimmerman, Kip D., Langefeld, Carl D.
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8088563/
https://www.ncbi.nlm.nih.gov/pubmed/33932993
http://dx.doi.org/10.1186/s12864-021-07635-w
_version_ 1783686869195685888
author Zimmerman, Kip D.
Langefeld, Carl D.
author_facet Zimmerman, Kip D.
Langefeld, Carl D.
author_sort Zimmerman, Kip D.
collection PubMed
description BACKGROUND: Study design is a critical aspect of any experiment, and sample size calculations for statistical power that are consistent with that study design are central to robust and reproducible results. However, the existing power calculators for tests of differential expression in single-cell RNA-seq data focus on the total number of cells and not the number of independent experimental units, the true unit of interest for power. Thus, current methods grossly overestimate the power. RESULTS: Hierarchicell is the first single-cell power calculator to explicitly simulate and account for the hierarchical correlation structure (i.e., within sample correlation) that exists in single-cell RNA-seq data. Hierarchicell, an R-package available on GitHub, estimates the within sample correlation structure from real data to simulate hierarchical single-cell RNA-seq data and estimate power for tests of differential expression. This multi-stage approach models gene dropout rates, intra-individual dispersion, inter-individual variation, variable or fixed number of cells per individual, and the correlation among cells within an individual. Without modeling the within sample correlation structure and without properly accounting for the correlation in downstream analysis, we demonstrate that estimates of power are falsely inflated. Hierarchicell can be used to estimate power for binary and continuous phenotypes based on user-specified number of independent experimental units (e.g., individuals) and cells within the experimental unit. CONCLUSIONS: Hierarchicell is a user-friendly R-package that provides accurate estimates of power for testing hypotheses of differential expression in single-cell RNA-seq data. This R-package represents an important addition to single-cell RNA analytic tools and will help researchers design experiments with appropriate and accurate power, increasing discovery and improving robustness and reproducibility. 
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-021-07635-w.
format Online
Article
Text
id pubmed-8088563
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-80885632021-05-03 Hierarchicell: an R-package for estimating power for tests of differential expression with single-cell data Zimmerman, Kip D. Langefeld, Carl D. BMC Genomics Software BACKGROUND: Study design is a critical aspect of any experiment, and sample size calculations for statistical power that are consistent with that study design are central to robust and reproducible results. However, the existing power calculators for tests of differential expression in single-cell RNA-seq data focus on the total number of cells and not the number of independent experimental units, the true unit of interest for power. Thus, current methods grossly overestimate the power. RESULTS: Hierarchicell is the first single-cell power calculator to explicitly simulate and account for the hierarchical correlation structure (i.e., within sample correlation) that exists in single-cell RNA-seq data. Hierarchicell, an R-package available on GitHub, estimates the within sample correlation structure from real data to simulate hierarchical single-cell RNA-seq data and estimate power for tests of differential expression. This multi-stage approach models gene dropout rates, intra-individual dispersion, inter-individual variation, variable or fixed number of cells per individual, and the correlation among cells within an individual. Without modeling the within sample correlation structure and without properly accounting for the correlation in downstream analysis, we demonstrate that estimates of power are falsely inflated. Hierarchicell can be used to estimate power for binary and continuous phenotypes based on user-specified number of independent experimental units (e.g., individuals) and cells within the experimental unit. CONCLUSIONS: Hierarchicell is a user-friendly R-package that provides accurate estimates of power for testing hypotheses of differential expression in single-cell RNA-seq data. 
This R-package represents an important addition to single-cell RNA analytic tools and will help researchers design experiments with appropriate and accurate power, increasing discovery and improving robustness and reproducibility. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-021-07635-w. BioMed Central 2021-05-01 /pmc/articles/PMC8088563/ /pubmed/33932993 http://dx.doi.org/10.1186/s12864-021-07635-w Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Software
Zimmerman, Kip D.
Langefeld, Carl D.
Hierarchicell: an R-package for estimating power for tests of differential expression with single-cell data
title Hierarchicell: an R-package for estimating power for tests of differential expression with single-cell data
title_full Hierarchicell: an R-package for estimating power for tests of differential expression with single-cell data
title_fullStr Hierarchicell: an R-package for estimating power for tests of differential expression with single-cell data
title_full_unstemmed Hierarchicell: an R-package for estimating power for tests of differential expression with single-cell data
title_short Hierarchicell: an R-package for estimating power for tests of differential expression with single-cell data
title_sort hierarchicell: an r-package for estimating power for tests of differential expression with single-cell data
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8088563/
https://www.ncbi.nlm.nih.gov/pubmed/33932993
http://dx.doi.org/10.1186/s12864-021-07635-w
work_keys_str_mv AT zimmermankipd hierarchicellanrpackageforestimatingpowerfortestsofdifferentialexpressionwithsinglecelldata
AT langefeldcarld hierarchicellanrpackageforestimatingpowerfortestsofdifferentialexpressionwithsinglecelldata