Accuracy of RNA-Seq and its dependence on sequencing depth

BACKGROUND: The cost of DNA sequencing has undergone a dramatic reduction in the past decade. As a result, sequencing technologies have been increasingly applied to genomic research. RNA-Seq is becoming a common technique for surveying gene expression based on DNA sequencing. As it is not clear ho...

Bibliographic Details
Main authors: Cai, Guoshuai; Li, Hua; Lu, Yue; Huang, Xuelin; Lee, Juhee; Müller, Peter; Ji, Yuan; Liang, Shoudan
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3426807/
https://www.ncbi.nlm.nih.gov/pubmed/23320920
http://dx.doi.org/10.1186/1471-2105-13-S13-S5
author Cai, Guoshuai
Li, Hua
Lu, Yue
Huang, Xuelin
Lee, Juhee
Müller, Peter
Ji, Yuan
Liang, Shoudan
collection PubMed
description BACKGROUND: The cost of DNA sequencing has undergone a dramatic reduction in the past decade. As a result, sequencing technologies have been increasingly applied to genomic research. RNA-Seq is becoming a common technique for surveying gene expression based on DNA sequencing. As it is not clear how increased sequencing capacity has affected the measurement accuracy of mRNA, we sought to investigate that relationship. RESULT: We empirically evaluate the accuracy of repeated gene expression measurements using RNA-Seq. We identify library preparation steps prior to DNA sequencing as the main source of error in this process. Studying three datasets, we show that the accuracy indeed improves with the sequencing depth. However, the rate of improvement as a function of the number of sequence reads is generally slower than predicted by the binomial distribution. We therefore used the beta-binomial distribution to model the overdispersion. The overdispersion parameters we introduced depend explicitly on the number of reads, so that the resulting statistical uncertainty is consistent with the empirical observation that measurement accuracy increases with the sequencing depth. The overdispersion parameters were determined by maximizing the likelihood. We showed that our modified beta-binomial model had a lower false discovery rate than the binomial or the pure beta-binomial models. CONCLUSION: We proposed a novel form of overdispersion guaranteeing that the accuracy improves with sequencing depth. We demonstrated that the new form provides a better fit to the data.
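The contrast the abstract draws between the binomial and the pure beta-binomial models can be illustrated with a short numerical sketch. It is not the authors' implementation: the function names, the intra-class correlation parameterization (rho = 1 / (alpha + beta + 1)), and the example values of p and rho are assumptions made here for illustration. The depth-dependent overdispersion that the authors propose is not specified in this record and is not reproduced. The sketch only shows the textbook result that, with a fixed overdispersion, the uncertainty of an estimated read fraction stops shrinking as the number of reads grows, whereas the binomial uncertainty keeps falling as 1/sqrt(n).

```python
import math


def binomial_sd_of_fraction(n, p):
    """Standard deviation of X/n when X ~ Binomial(n, p)."""
    return math.sqrt(p * (1.0 - p) / n)


def beta_binomial_sd_of_fraction(n, p, rho):
    """Standard deviation of X/n when X ~ BetaBinomial(n, alpha, beta).

    Assumed parameterization: p = alpha / (alpha + beta) and
    rho = 1 / (alpha + beta + 1), which gives
    Var(X) = n * p * (1 - p) * (1 + (n - 1) * rho).
    """
    return math.sqrt(p * (1.0 - p) * (1.0 + (n - 1) * rho) / n)


if __name__ == "__main__":
    p = 0.5      # hypothetical expected fraction of reads
    rho = 0.01   # hypothetical fixed overdispersion, independent of depth
    print("reads  binomial_sd  beta_binomial_sd")
    for n in (10, 100, 1_000, 10_000, 100_000):
        print(n,
              round(binomial_sd_of_fraction(n, p), 5),
              round(beta_binomial_sd_of_fraction(n, p, rho), 5))
    # With fixed rho the beta-binomial value approaches a floor instead of zero.
    print("floor:", round(math.sqrt(p * (1.0 - p) * rho), 5))
```

Under these assumptions the beta-binomial column levels off near sqrt(p * (1 - p) * rho) however many reads are added, which is one way to see why an overdispersion that depends on the number of reads is needed if the modelled accuracy is to keep improving with depth.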
format Online
Article
Text
id pubmed-3426807
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3426807 2012-08-24 Accuracy of RNA-Seq and its dependence on sequencing depth Cai, Guoshuai; Li, Hua; Lu, Yue; Huang, Xuelin; Lee, Juhee; Müller, Peter; Ji, Yuan; Liang, Shoudan BMC Bioinformatics Research BioMed Central 2012-08-24 /pmc/articles/PMC3426807/ /pubmed/23320920 http://dx.doi.org/10.1186/1471-2105-13-S13-S5 Text en Copyright ©2012 Cai et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Accuracy of RNA-Seq and its dependence on sequencing depth
topic Research