Power analysis for RNA-Seq differential expression studies using generalized linear mixed effects models
BACKGROUND: Power analysis becomes an inevitable step in experimental design of current biomedical research. Complex designs allowing diverse correlation structures are commonly used in RNA-Seq experiments. However, the field currently lacks statistical methods to calculate sample size and estimate...
Main Authors: | Yu, Lianbo; Fernandez, Soledad; Brock, Guy |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | Methodology Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7236949/ https://www.ncbi.nlm.nih.gov/pubmed/32429934 http://dx.doi.org/10.1186/s12859-020-3541-7 |
_version_ | 1783536237686030336 |
---|---|
author | Yu, Lianbo Fernandez, Soledad Brock, Guy |
author_sort | Yu, Lianbo |
collection | PubMed |
description | BACKGROUND: Power analysis becomes an inevitable step in experimental design of current biomedical research. Complex designs allowing diverse correlation structures are commonly used in RNA-Seq experiments. However, the field currently lacks statistical methods to calculate sample size and estimate power for RNA-Seq differential expression studies using such designs. To fill the gap, simulation based methods have a great advantage by providing numerical solutions, since theoretical distributions of test statistics are typically unavailable for such designs. RESULTS: In this paper, we propose a novel simulation based procedure for power estimation of differential expression with the employment of generalized linear mixed effects models for correlated expression data. We also propose a new procedure for power estimation of differential expression with the use of a bivariate negative binomial distribution for paired designs. We compare the performance of both the likelihood ratio test and Wald test under a variety of simulation scenarios with the proposed procedures. The simulated distribution was used to estimate the null distribution of test statistics in order to achieve the desired false positive control and was compared to the asymptotic Chi-square distribution. In addition, we applied the procedure for paired designs to the TCGA breast cancer data set. CONCLUSIONS: In summary, we provide a framework for power estimation of RNA-Seq differential expression under complex experimental designs. Simulation results demonstrate that both the proposed procedures properly control the false positive rate at the nominal level. |
format | Online Article Text |
id | pubmed-7236949 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-7236949 2020-05-27 Power analysis for RNA-Seq differential expression studies using generalized linear mixed effects models Yu, Lianbo Fernandez, Soledad Brock, Guy BMC Bioinformatics Methodology Article
BioMed Central 2020-05-19 /pmc/articles/PMC7236949/ /pubmed/32429934 http://dx.doi.org/10.1186/s12859-020-3541-7 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Power analysis for RNA-Seq differential expression studies using generalized linear mixed effects models |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7236949/ https://www.ncbi.nlm.nih.gov/pubmed/32429934 http://dx.doi.org/10.1186/s12859-020-3541-7 |
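
The description in this record outlines simulation-based power estimation in which the null distribution of the test statistic is itself simulated, rather than taken from the asymptotic chi-square distribution. The general idea can be sketched in Python for a paired design. This is an illustrative sketch only, not the authors' procedure: the function names, the default parameter values, the gamma-Poisson construction of correlated counts, and the simple paired Wald-type statistic on log counts are all assumptions made here, standing in for the generalized linear mixed effects model and bivariate negative binomial fits named in the abstract.

```python
import numpy as np


def simulate_paired_counts(rng, n_pairs, mean, dispersion, log_fc, subject_sd):
    """Simulate paired counts for one gene (hypothetical helper).

    A subject-level random intercept shared by both samples of a pair
    induces within-pair correlation. Counts are negative binomial,
    drawn as a gamma-Poisson mixture with variance mu + dispersion * mu^2.
    """
    b = rng.normal(0.0, subject_sd, size=n_pairs)  # shared subject effect
    mu_ctrl = mean * np.exp(b)
    mu_trt = mean * np.exp(b + log_fc)
    shape = 1.0 / dispersion
    y_ctrl = rng.poisson(rng.gamma(shape, mu_ctrl / shape))
    y_trt = rng.poisson(rng.gamma(shape, mu_trt / shape))
    return y_ctrl, y_trt


def paired_stat(y_ctrl, y_trt):
    """Squared Wald-type statistic on paired log-count differences.

    Assumption: a simple stand-in for a model-based LRT or Wald statistic.
    """
    d = np.log(y_trt + 0.5) - np.log(y_ctrl + 0.5)
    se = d.std(ddof=1) / np.sqrt(d.size)
    return 0.0 if se == 0 else float((d.mean() / se) ** 2)


def estimate_power(n_pairs, log_fc, alpha=0.05, n_sim=2000, mean=100.0,
                   dispersion=0.1, subject_sd=0.5, seed=0):
    """Estimate power using a simulated null distribution.

    Returns (power, critical_value). The critical value is the empirical
    (1 - alpha) quantile of the statistic simulated under log_fc = 0.
    """
    rng = np.random.default_rng(seed)

    def draw(effect):
        return np.array([
            paired_stat(*simulate_paired_counts(
                rng, n_pairs, mean, dispersion, effect, subject_sd))
            for _ in range(n_sim)
        ])

    null_stats = draw(0.0)                     # statistic under H0
    crit = float(np.quantile(null_stats, 1.0 - alpha))
    alt_stats = draw(log_fc)                   # statistic under H1
    return float(np.mean(alt_stats > crit)), crit


if __name__ == "__main__":
    for n in (5, 10, 20):
        power, crit = estimate_power(n_pairs=n, log_fc=np.log(1.5))
        print(f"n_pairs={n}: power={power:.3f}, critical value={crit:.3f}")
```

Calibrating the threshold against simulated null statistics is what keeps the false positive rate near the nominal level when the asymptotic approximation is unreliable at small sample sizes. Sample size can then be chosen by increasing `n_pairs` until the estimated power reaches the target.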