
how_are_we_stranded_here: quick determination of RNA-Seq strandedness

BACKGROUND: Quality control checks are the first step in RNA-Sequencing analysis, which enable the identification of common issues that occur in the sequenced reads. Checks for sequence quality, contamination, and complexity are commonplace, and allow users to implement steps downstream which can ac...


Bibliographic Details
Main Authors: Signal, Brandon; Kahlke, Tim
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects: Software
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8783475/
https://www.ncbi.nlm.nih.gov/pubmed/35065593
http://dx.doi.org/10.1186/s12859-022-04572-7
_version_ 1784638546964905984
author Signal, Brandon
Kahlke, Tim
author_facet Signal, Brandon
Kahlke, Tim
author_sort Signal, Brandon
collection PubMed
description BACKGROUND: Quality control checks are the first step in RNA-Sequencing analysis, which enable the identification of common issues that occur in the sequenced reads. Checks for sequence quality, contamination, and complexity are commonplace, and allow users to implement steps downstream which can account for these issues. Strand-specificity of reads is frequently overlooked and is often unavailable even in published data, yet when unknown or incorrectly specified can have detrimental effects on the reproducibility and accuracy of downstream analyses. RESULTS: To address these issues, we developed how_are_we_stranded_here, a Python library that helps to quickly infer strandedness of paired-end RNA-Sequencing data. Testing on both simulated and real RNA-Sequencing reads showed that it correctly measures strandedness, and measures outside the normal range may indicate sample contamination. CONCLUSIONS: how_are_we_stranded_here is fast and user friendly, making it easy to implement in quality control pipelines prior to analysing RNA-Sequencing data. how_are_we_stranded_here is freely available at https://github.com/betsig/how_are_we_stranded_here. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-04572-7.
format Online
Article
Text
id pubmed-8783475
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8783475 2022-01-24 how_are_we_stranded_here: quick determination of RNA-Seq strandedness Signal, Brandon Kahlke, Tim BMC Bioinformatics Software BACKGROUND: Quality control checks are the first step in RNA-Sequencing analysis, which enable the identification of common issues that occur in the sequenced reads. Checks for sequence quality, contamination, and complexity are commonplace, and allow users to implement steps downstream which can account for these issues. Strand-specificity of reads is frequently overlooked and is often unavailable even in published data, yet when unknown or incorrectly specified can have detrimental effects on the reproducibility and accuracy of downstream analyses. RESULTS: To address these issues, we developed how_are_we_stranded_here, a Python library that helps to quickly infer strandedness of paired-end RNA-Sequencing data. Testing on both simulated and real RNA-Sequencing reads showed that it correctly measures strandedness, and measures outside the normal range may indicate sample contamination. CONCLUSIONS: how_are_we_stranded_here is fast and user friendly, making it easy to implement in quality control pipelines prior to analysing RNA-Sequencing data. how_are_we_stranded_here is freely available at https://github.com/betsig/how_are_we_stranded_here. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-04572-7. BioMed Central 2022-01-22 /pmc/articles/PMC8783475/ /pubmed/35065593 http://dx.doi.org/10.1186/s12859-022-04572-7 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Software
Signal, Brandon
Kahlke, Tim
how_are_we_stranded_here: quick determination of RNA-Seq strandedness
title how_are_we_stranded_here: quick determination of RNA-Seq strandedness
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8783475/
https://www.ncbi.nlm.nih.gov/pubmed/35065593
http://dx.doi.org/10.1186/s12859-022-04572-7
work_keys_str_mv AT signalbrandon howarewestrandedherequickdeterminationofrnaseqstrandedness
AT kahlketim howarewestrandedherequickdeterminationofrnaseqstrandedness