RNASequel: accurate and repeat tolerant realignment of RNA-seq reads
RNA-seq is a key technology for understanding the biology of the cell because of its ability to profile transcriptional and post-transcriptional regulation at single nucleotide resolutions. Compared to DNA sequencing alignment algorithms, RNA-seq alignment algorithms have a diminished ability to acc...
Main Authors: | Wilson, Gavin W., Stein, Lincoln D. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2015 |
Subjects: | Methods Online |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4605292/ https://www.ncbi.nlm.nih.gov/pubmed/26082497 http://dx.doi.org/10.1093/nar/gkv594 |
_version_ | 1782395186648711168 |
---|---|
author | Wilson, Gavin W. Stein, Lincoln D. |
author_facet | Wilson, Gavin W. Stein, Lincoln D. |
author_sort | Wilson, Gavin W. |
collection | PubMed |
description | RNA-seq is a key technology for understanding the biology of the cell because of its ability to profile transcriptional and post-transcriptional regulation at single nucleotide resolutions. Compared to DNA sequencing alignment algorithms, RNA-seq alignment algorithms have a diminished ability to accurately detect and map base pair substitutions, gaps, discordant pairs and repetitive regions. These shortcomings adversely affect experiments that require a high degree of accuracy, notably the ability to detect RNA editing. We have developed RNASequel, a software package that runs as a post-processing step in conjunction with an RNA-seq aligner and systematically corrects common alignment artifacts. Its key innovations are a two-pass splice junction alignment system that includes de novo splice junctions and the use of an empirically determined estimate of the fragment size distribution when resolving read pairs. We demonstrate that RNASequel produces improved alignments when used in conjunction with STAR or Tophat2 using two simulated datasets. We then show that RNASequel improves the identification of adenosine to inosine RNA editing sites on biological datasets. This software will be useful in applications requiring the accurate identification of variants in RNA sequencing data, the discovery of RNA editing sites and the analysis of alternative splicing. |
format | Online Article Text |
id | pubmed-4605292 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-4605292 2015-10-19 RNASequel: accurate and repeat tolerant realignment of RNA-seq reads Wilson, Gavin W. Stein, Lincoln D. Nucleic Acids Res Methods Online RNA-seq is a key technology for understanding the biology of the cell because of its ability to profile transcriptional and post-transcriptional regulation at single nucleotide resolutions. Compared to DNA sequencing alignment algorithms, RNA-seq alignment algorithms have a diminished ability to accurately detect and map base pair substitutions, gaps, discordant pairs and repetitive regions. These shortcomings adversely affect experiments that require a high degree of accuracy, notably the ability to detect RNA editing. We have developed RNASequel, a software package that runs as a post-processing step in conjunction with an RNA-seq aligner and systematically corrects common alignment artifacts. Its key innovations are a two-pass splice junction alignment system that includes de novo splice junctions and the use of an empirically determined estimate of the fragment size distribution when resolving read pairs. We demonstrate that RNASequel produces improved alignments when used in conjunction with STAR or Tophat2 using two simulated datasets. We then show that RNASequel improves the identification of adenosine to inosine RNA editing sites on biological datasets. This software will be useful in applications requiring the accurate identification of variants in RNA sequencing data, the discovery of RNA editing sites and the analysis of alternative splicing. Oxford University Press 2015-10-15 2015-10-10 /pmc/articles/PMC4605292/ /pubmed/26082497 http://dx.doi.org/10.1093/nar/gkv594 Text en © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Methods Online Wilson, Gavin W. Stein, Lincoln D. RNASequel: accurate and repeat tolerant realignment of RNA-seq reads |
title | RNASequel: accurate and repeat tolerant realignment of RNA-seq reads |
title_full | RNASequel: accurate and repeat tolerant realignment of RNA-seq reads |
title_fullStr | RNASequel: accurate and repeat tolerant realignment of RNA-seq reads |
title_full_unstemmed | RNASequel: accurate and repeat tolerant realignment of RNA-seq reads |
title_short | RNASequel: accurate and repeat tolerant realignment of RNA-seq reads |
title_sort | rnasequel: accurate and repeat tolerant realignment of rna-seq reads |
topic | Methods Online |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4605292/ https://www.ncbi.nlm.nih.gov/pubmed/26082497 http://dx.doi.org/10.1093/nar/gkv594 |
work_keys_str_mv | AT wilsongavinw rnasequelaccurateandrepeattolerantrealignmentofrnaseqreads AT steinlincolnd rnasequelaccurateandrepeattolerantrealignmentofrnaseqreads |