
subSeq: Determining Appropriate Sequencing Depth Through Efficient Read Subsampling

Motivation: Next-generation sequencing experiments, such as RNA-Seq, play an increasingly important role in biological research. One complication is that the power and accuracy of such experiments depend substantially on the number of reads sequenced, so it is important and challenging to determine...


Bibliographic Details
Main Authors: Robinson, David G., Storey, John D.
Format: Online Article Text
Language: English
Published: Oxford University Press 2014
Subjects: Applications Notes
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4296149/
https://www.ncbi.nlm.nih.gov/pubmed/25189781
http://dx.doi.org/10.1093/bioinformatics/btu552
_version_ 1782352934505283584
author Robinson, David G.
Storey, John D.
collection PubMed
description Motivation: Next-generation sequencing experiments, such as RNA-Seq, play an increasingly important role in biological research. One complication is that the power and accuracy of such experiments depend substantially on the number of reads sequenced, so it is important and challenging to determine the optimal read depth for an experiment or to verify whether one has adequate depth in an existing experiment. Results: By randomly sampling lower depths from a sequencing experiment and determining where the saturation of power and accuracy occurs, one can determine what the most useful depth should be for future experiments, and furthermore, confirm whether an existing experiment had sufficient depth to justify its conclusions. We introduce the subSeq R package, which uses a novel efficient approach to perform this subsampling and to calculate informative metrics at each depth. Availability and Implementation: The subSeq R package is available at http://github.com/StoreyLab/subSeq/. Contact: dgrtwo@princeton.edu or jstorey@princeton.edu Supplementary information: Supplementary data are available at Bioinformatics online.
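The subsampling idea in the description above can be illustrated with a minimal sketch. It assumes reads have already been summarized as per-gene counts and thins each count by keeping every read independently with a fixed probability; this record does not say how the subSeq package implements its "novel efficient approach", so the code should not be read as the package's method or API. The function names `subsample_counts` and `detected_genes`, the `min_reads` threshold, and the count values are all hypothetical, and the "detected genes" figure is only a stand-in for the metrics the package reports.

```python
import random


def subsample_counts(counts, proportion, seed=0):
    """Thin per-gene read counts: keep each read with probability `proportion`.

    Hypothetical helper for illustration; not part of the subSeq package.
    """
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    # Each of the `c` reads of a gene is an independent keep/drop trial,
    # so the subsampled count follows a binomial distribution.
    return [sum(1 for _ in range(c) if rng.random() < proportion) for c in counts]


def detected_genes(counts, min_reads=5):
    """Count genes with at least `min_reads` reads (a stand-in metric)."""
    return sum(1 for c in counts if c >= min_reads)


# Made-up per-gene counts for one sample at full depth.
full_depth = [120, 8, 0, 45, 3, 300, 17, 6]

# Compute the metric at several simulated depths to look for saturation.
for proportion in (0.1, 0.25, 0.5, 1.0):
    sub = subsample_counts(full_depth, proportion)
    print(proportion, sum(sub), detected_genes(sub))
```

If the metric stops changing well before the full depth is reached, that would suggest, under these assumptions, that the existing depth was adequate for that metric.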
format Online
Article
Text
id pubmed-4296149
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4296149
2015-01-22
subSeq: Determining Appropriate Sequencing Depth Through Efficient Read Subsampling
Robinson, David G.
Storey, John D.
Bioinformatics
Applications Notes
Oxford University Press
2014-12-01
2014-09-03
/pmc/articles/PMC4296149/
/pubmed/25189781
http://dx.doi.org/10.1093/bioinformatics/btu552
Text
en
© The Author 2014. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/4.0/
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title subSeq: Determining Appropriate Sequencing Depth Through Efficient Read Subsampling
topic Applications Notes