Evaluation of tools for long read RNA-seq splice-aware alignment

MOTIVATION: High-throughput sequencing has transformed the study of gene expression levels through RNA-seq, a technique that is now routinely used by various fields, such as genetic research or diagnostics. The advent of third generation sequencing technologies providing significantly longer reads o...

Bibliographic Details
Main Authors: Križanović, Krešimir, Echchiki, Amina, Roux, Julien, Šikić, Mile
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects: Original Papers
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6192213/
https://www.ncbi.nlm.nih.gov/pubmed/29069314
http://dx.doi.org/10.1093/bioinformatics/btx668
author Križanović, Krešimir
Echchiki, Amina
Roux, Julien
Šikić, Mile
collection PubMed
description MOTIVATION: High-throughput sequencing has transformed the study of gene expression levels through RNA-seq, a technique that is now routinely used by various fields, such as genetic research or diagnostics. The advent of third generation sequencing technologies providing significantly longer reads opens up new possibilities. However, the high error rates common to these technologies set new bioinformatics challenges for the gapped alignment of reads to their genomic origin. In this study, we have explored how currently available RNA-seq splice-aware alignment tools cope with increased read lengths and error rates. All tested tools were initially developed for short NGS reads, but some have claimed support for long Pacific Biosciences (PacBio) or even Oxford Nanopore Technologies (ONT) MinION reads. RESULTS: The tools were tested on synthetic and real datasets from two technologies (PacBio and ONT MinION). Alignment quality and resource usage were compared across different aligners. The effect of error correction of long reads was explored, both using self-correction and correction with an external short reads dataset. A tool was developed for evaluating RNA-seq alignment results. This tool can be used to compare the alignment of simulated reads to their genomic origin, or to compare the alignment of real reads to a set of annotated transcripts. Our tests show that while some RNA-seq aligners were unable to cope with long error-prone reads, others produced overall good results. We further show that alignment accuracy can be improved using error-corrected reads. AVAILABILITY AND IMPLEMENTATION: https://github.com/kkrizanovic/RNAseqEval, https://figshare.com/projects/RNAseq_benchmark/24391 SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-6192213
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6192213 2019-03-01 Evaluation of tools for long read RNA-seq splice-aware alignment Križanović, Krešimir Echchiki, Amina Roux, Julien Šikić, Mile Bioinformatics Original Papers
Oxford University Press 2018-03-01 2017-10-23 /pmc/articles/PMC6192213/ /pubmed/29069314 http://dx.doi.org/10.1093/bioinformatics/btx668 Text en © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Evaluation of tools for long read RNA-seq splice-aware alignment
topic Original Papers