Sample size calculation while controlling false discovery rate for differential expression analysis with RNA-sequencing experiments

BACKGROUND: RNA-Sequencing (RNA-seq) experiments have been popularly applied to transcriptome studies in recent years. Such experiments are still relatively costly. As a result, RNA-seq experiments often employ a small number of replicates. Power analysis and sample size calculation are challenging...

Bibliographic Details
Main authors:	Bi, Ran; Liu, Peng
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2016
Subjects:	Methodology Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4815167/ https://www.ncbi.nlm.nih.gov/pubmed/27029470 http://dx.doi.org/10.1186/s12859-016-0994-9

_version_	1782424553062924288
author	Bi, Ran Liu, Peng
author_facet	Bi, Ran Liu, Peng
author_sort	Bi, Ran
collection	PubMed
description	BACKGROUND: RNA-Sequencing (RNA-seq) experiments have been popularly applied to transcriptome studies in recent years. Such experiments are still relatively costly. As a result, RNA-seq experiments often employ a small number of replicates. Power analysis and sample size calculation are challenging in the context of differential expression analysis with RNA-seq data. One challenge is that there are no closed-form formulae to calculate power for the popularly applied tests for differential expression analysis. In addition, false discovery rate (FDR), instead of family-wise type I error rate, is controlled for the multiple testing error in RNA-seq data analysis. So far, there are very few proposals on sample size calculation for RNA-seq experiments. RESULTS: In this paper, we propose a procedure for sample size calculation while controlling FDR for RNA-seq experimental design. Our procedure is based on the weighted linear model analysis facilitated by the voom method which has been shown to have competitive performance in terms of power and FDR control for RNA-seq differential expression analysis. We derive a method that approximates the average power across the differentially expressed genes, and then calculate the sample size to achieve a desired average power while controlling FDR. Simulation results demonstrate that the actual power of several popularly applied tests for differential expression is achieved and is close to the desired power for RNA-seq data with sample size calculated based on our method. CONCLUSIONS: Our proposed method provides an efficient algorithm to calculate sample size while controlling FDR for RNA-seq experimental design. We also provide an R package ssizeRNA that implements our proposed method and can be downloaded from the Comprehensive R Archive Network (http://cran.r-project.org). 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-0994-9) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-4815167
institution	National Center for Biotechnology Information
language	English
publishDate	2016
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-4815167 2016-04-01 Sample size calculation while controlling false discovery rate for differential expression analysis with RNA-sequencing experiments Bi, Ran Liu, Peng BMC Bioinformatics Methodology Article BioMed Central 2016-03-31 /pmc/articles/PMC4815167/ /pubmed/27029470 http://dx.doi.org/10.1186/s12859-016-0994-9 Text en © Bi and Liu. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	Sample size calculation while controlling false discovery rate for differential expression analysis with RNA-sequencing experiments
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4815167/ https://www.ncbi.nlm.nih.gov/pubmed/27029470 http://dx.doi.org/10.1186/s12859-016-0994-9
