
Dispersion Estimation and Its Effect on Test Performance in RNA-seq Data Analysis: A Simulation-Based Comparison of Methods

A central goal of RNA sequencing (RNA-seq) experiments is to detect differentially expressed genes. In the ubiquitous negative binomial model for RNA-seq data, each gene is given a dispersion parameter, and correctly estimating these dispersion parameters is vital to detecting differential expressio...


Bibliographic Details
Main Authors: Landau, William Michael; Liu, Peng
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3857202/
https://www.ncbi.nlm.nih.gov/pubmed/24349066
http://dx.doi.org/10.1371/journal.pone.0081415
author Landau, William Michael
Liu, Peng
collection PubMed
description A central goal of RNA sequencing (RNA-seq) experiments is to detect differentially expressed genes. In the ubiquitous negative binomial model for RNA-seq data, each gene is given a dispersion parameter, and correctly estimating these dispersion parameters is vital to detecting differential expression. Since the dispersions control the variances of the gene counts, underestimation may lead to false discovery, while overestimation may lower the rate of true detection. After briefly reviewing several popular dispersion estimation methods, this article describes a simulation study that compares them in terms of point estimation and the effect on the performance of tests for differential expression. The methods that maximize the test performance are the ones that use a moderate degree of dispersion shrinkage: DSS, Tagwise wqCML, and Tagwise APL. In practical RNA-seq data analysis, we recommend using one of these moderate-shrinkage methods with the QLShrink test in the QuasiSeq R package.
format Online
Article
Text
id pubmed-3857202
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3857202 2013-12-13
Landau, William Michael
Liu, Peng
PLoS One
Research Article
Public Library of Science 2013-12-09
/pmc/articles/PMC3857202/
/pubmed/24349066
http://dx.doi.org/10.1371/journal.pone.0081415
Text en
© 2013 Landau, Liu
http://creativecommons.org/licenses/by/4.0/
This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Dispersion Estimation and Its Effect on Test Performance in RNA-seq Data Analysis: A Simulation-Based Comparison of Methods
topic Research Article