
Assessment of the Impact of Using a Reference Transcriptome in Mapping Short RNA-Seq Reads

RNA-Seq has become increasingly popular in transcriptome profiling. The major challenge in RNA-Seq data analysis is the accurate mapping of junction reads to their genomic origins. To detect splicing sites in short reads, many RNA-Seq aligners use reference transcriptome to inform placement of junct...

Bibliographic Details
Main Author: Zhao, Shanrong
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4081564/
https://www.ncbi.nlm.nih.gov/pubmed/24992027
http://dx.doi.org/10.1371/journal.pone.0101374
_version_ 1782324122072645632
author Zhao, Shanrong
author_facet Zhao, Shanrong
author_sort Zhao, Shanrong
collection PubMed
description RNA-Seq has become increasingly popular in transcriptome profiling. The major challenge in RNA-Seq data analysis is the accurate mapping of junction reads to their genomic origins. To detect splicing sites in short reads, many RNA-Seq aligners use reference transcriptome to inform placement of junction reads. However, no systematic evaluation has been performed to assess or quantify the benefits of incorporating reference transcriptome in mapping RNA-Seq reads. In this paper, we have studied the impact of reference transcriptome on mapping RNA-Seq reads, especially on junction ones. The same dataset were analysed with and without RefGene transcriptome, respectively. Then a Perl script was developed to analyse and compare the mapping results. It was found that about 50–55% junction reads can be mapped to the same genomic regions regardless of the usage of RefGene model. More than one-third of reads fail to be mapped without the help of a reference transcriptome. For “Alternatively” mapped reads, i.e., those reads mapped differently with and without RefGene model, the mappings without RefGene model are usually worse than their corresponding alignments with RefGene model. For junction reads that span more than two exons, it is less likely to align them correctly without the assistance of reference transcriptome. As the sequencing technology evolves, the read length is becoming longer and longer. When reads become longer, they are more likely to span multiple exons, and thus the mapping of long junction reads is actually becoming more and more challenging without the assistance of reference transcriptome. Therefore, the advantages of using reference transcriptome in the mapping demonstrated in this study are becoming more evident for longer reads. In addition, the effect of the completeness of reference transcriptome on mapping of RNA-Seq reads is discussed.
format Online
Article
Text
id pubmed-4081564
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4081564 2014-07-10 Assessment of the Impact of Using a Reference Transcriptome in Mapping Short RNA-Seq Reads Zhao, Shanrong PLoS One Research Article Public Library of Science 2014-07-03 /pmc/articles/PMC4081564/ /pubmed/24992027 http://dx.doi.org/10.1371/journal.pone.0101374 Text en © 2014 Shanrong Zhao http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Zhao, Shanrong
Assessment of the Impact of Using a Reference Transcriptome in Mapping Short RNA-Seq Reads
title Assessment of the Impact of Using a Reference Transcriptome in Mapping Short RNA-Seq Reads
title_full Assessment of the Impact of Using a Reference Transcriptome in Mapping Short RNA-Seq Reads
title_fullStr Assessment of the Impact of Using a Reference Transcriptome in Mapping Short RNA-Seq Reads
title_full_unstemmed Assessment of the Impact of Using a Reference Transcriptome in Mapping Short RNA-Seq Reads
title_short Assessment of the Impact of Using a Reference Transcriptome in Mapping Short RNA-Seq Reads
title_sort assessment of the impact of using a reference transcriptome in mapping short rna-seq reads
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4081564/
https://www.ncbi.nlm.nih.gov/pubmed/24992027
http://dx.doi.org/10.1371/journal.pone.0101374
work_keys_str_mv AT zhaoshanrong assessmentoftheimpactofusingareferencetranscriptomeinmappingshortrnaseqreads