
dupRadar: a Bioconductor package for the assessment of PCR artifacts in RNA-Seq data

BACKGROUND: PCR clonal artefacts originating from NGS library preparation can affect both genomic and RNA-Seq applications when protocols are pushed to their limits. In RNA-Seq, however, the artefactual reads are not easy to tell apart from normal read duplication due to natural over-sequencing...


Bibliographic Details
Main authors: Sayols, Sergi; Scherzinger, Denise; Klein, Holger
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5073875/
https://www.ncbi.nlm.nih.gov/pubmed/27769170
http://dx.doi.org/10.1186/s12859-016-1276-2
collection PubMed
description BACKGROUND: PCR clonal artefacts originating from NGS library preparation can affect both genomic and RNA-Seq applications when protocols are pushed to their limits. In RNA-Seq, however, the artefactual reads are not easy to tell apart from normal read duplication due to natural over-sequencing of highly expressed genes. Especially when working with little input material or single cells, assessing the fraction of duplicate reads is an important quality control step for NGS data sets. Up to now, the only available tools calculate global duplication rates without taking into account the effect of gene expression levels, which leaves them of limited use for RNA-Seq data. RESULTS: Here we present the tool dupRadar, which provides an easy means to distinguish the fraction of reads originating in natural duplication due to high expression from the fraction induced by artefacts. dupRadar assesses the fraction of duplicate reads per gene as a function of the expression level. Apart from the Bioconductor package dupRadar, we provide shell scripts for easy integration into processing pipelines. CONCLUSIONS: The Bioconductor package dupRadar offers straightforward methods to assess RNA-Seq data sets for quality issues with PCR duplicates. It is aimed at simple integration into standard analysis pipelines as a default QC metric that is especially useful for low-input and single-cell RNA-Seq data sets. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1276-2) contains supplementary material, which is available to authorized users.
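The idea in the description, relating each gene's duplicate fraction to its expression level rather than reporting one global rate, can be illustrated with a minimal Python sketch. It is not dupRadar's API or implementation: the `GeneCounts` record, its field names, and the helper functions are hypothetical, and it assumes per-gene read counts with and without duplicates are already available from an upstream counting step.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GeneCounts:
    """Hypothetical per-gene input; all field names are assumptions."""
    gene_id: str
    length_bp: int      # gene length in base pairs
    all_reads: int      # reads assigned to the gene, duplicates included
    unique_reads: int   # reads left after removing marked duplicates


def duplication_rate(gene: GeneCounts) -> float:
    """Fraction of the gene's reads that are duplicates (0.0 if no reads)."""
    if gene.all_reads == 0:
        return 0.0
    return (gene.all_reads - gene.unique_reads) / gene.all_reads


def reads_per_kb(gene: GeneCounts) -> float:
    """Length-normalised read count, used here as a simple expression proxy."""
    return gene.all_reads / (gene.length_bp / 1000)


def duplication_by_expression(genes: List[GeneCounts]) -> List[Tuple[str, float, float]]:
    """Return (gene_id, reads_per_kb, duplication_rate) sorted by expression."""
    rows = [(g.gene_id, reads_per_kb(g), duplication_rate(g)) for g in genes]
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    example = [
        GeneCounts("geneA", 2000, 100, 90),
        GeneCounts("geneB", 1000, 5000, 1500),
    ]
    for gene_id, expression, dup in duplication_by_expression(example):
        print(f"{gene_id}\t{expression:.1f}\t{dup:.2f}")
```

In a table like this, high duplication confined to strongly expressed genes is consistent with natural over-sequencing, whereas high duplication among weakly expressed genes points towards PCR artefacts.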
id pubmed-5073875
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-5073875 2016-10-26 dupRadar: a Bioconductor package for the assessment of PCR artifacts in RNA-Seq data Sayols, Sergi Scherzinger, Denise Klein, Holger BMC Bioinformatics Software BioMed Central 2016-10-21 /pmc/articles/PMC5073875/ /pubmed/27769170 http://dx.doi.org/10.1186/s12859-016-1276-2 Text en © The Author(s). 2016
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
topic Software