
A Comprehensive Evaluation of Alignment Algorithms in the Context of RNA-Seq

Transcriptome sequencing (RNA-Seq) overcomes limitations of previously used RNA quantification methods and provides one experimental framework for both high-throughput characterization and quantification of transcripts at the nucleotide level. The first step and a major challenge in the analysis of...


Bibliographic Details
Main authors: Lindner, Robert; Friedel, Caroline C.
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3530550/
https://www.ncbi.nlm.nih.gov/pubmed/23300661
http://dx.doi.org/10.1371/journal.pone.0052403
_version_ 1782254028487393280
author Lindner, Robert
Friedel, Caroline C.
author_facet Lindner, Robert
Friedel, Caroline C.
author_sort Lindner, Robert
collection PubMed
description Transcriptome sequencing (RNA-Seq) overcomes limitations of previously used RNA quantification methods and provides one experimental framework for both high-throughput characterization and quantification of transcripts at the nucleotide level. The first step and a major challenge in the analysis of such experiments is the mapping of sequencing reads to a transcriptomic origin including the identification of splicing events. In recent years, a large number of such mapping algorithms have been developed, all of which have in common that they require algorithms for aligning a vast number of reads to genomic or transcriptomic sequences. Although the FM-index based aligner Bowtie has become a de facto standard within mapping pipelines, a much larger number of possible alignment algorithms have been developed also including other variants of FM-index based aligners. Accordingly, developers and users of RNA-seq mapping pipelines have the choice among a large number of available alignment algorithms. To provide guidance in the choice of alignment algorithms for these purposes, we evaluated the performance of 14 widely used alignment programs from three different algorithmic classes: algorithms using either hashing of the reference transcriptome, hashing of reads, or a compressed FM-index representation of the genome. Here, special emphasis was placed on both precision and recall and the performance for different read lengths and numbers of mismatches and indels in a read. Our results clearly showed the significant reduction in memory footprint and runtime provided by FM-index based aligners at a precision and recall comparable to the best hash table based aligners. Furthermore, the recently developed Bowtie 2 alignment algorithm shows a remarkable tolerance to both sequencing errors and indels, thus, essentially making hash-based aligners obsolete.
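As an illustration of the FM-index idea mentioned in the description, the following is a minimal sketch of exact read matching by backward search over a Burrows-Wheeler transform. It is not the implementation used by Bowtie or any other aligner in the study: the function names `build_fm_index` and `find_read` are hypothetical, the suffix array is built naively, the occurrence table is stored uncompressed, and mismatches and indels are not handled. It assumes the reference contains only characters that sort after the sentinel `$`, such as A, C, G and T.

```python
from collections import Counter


def build_fm_index(text):
    """Build a toy FM-index (suffix array, C table, occurrence table)."""
    text += "$"  # sentinel; assumed absent from text and smallest in sort order
    # Naive suffix array: acceptable for a demo, far too slow for a genome.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps to "$" for i == 0).
    bwt = "".join(text[i - 1] for i in sa)
    # c_table[c] = number of characters in text strictly smaller than c.
    counts = Counter(text)
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]
    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return sa, c_table, occ


def find_read(read, index):
    """Return sorted start positions of exact occurrences of read."""
    sa, c_table, occ = index
    lo, hi = 0, len(sa)  # half-open interval of matching suffix-array rows
    for ch in reversed(read):  # backward search: extend the match leftwards
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


index = build_fm_index("ACGTACGTTAGC")
hits = find_read("ACG", index)
```

Each character of the read narrows the interval with two table lookups, so the search cost depends on the read length rather than on the reference length. Production aligners additionally sample and compress these tables, which is where the memory savings noted in the description come from.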
format Online
Article
Text
id pubmed-3530550
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3530550 2013-01-08 A Comprehensive Evaluation of Alignment Algorithms in the Context of RNA-Seq Lindner, Robert Friedel, Caroline C. PLoS One Research Article Public Library of Science 2012-12-26 /pmc/articles/PMC3530550/ /pubmed/23300661 http://dx.doi.org/10.1371/journal.pone.0052403 Text en © 2012 Lindner, Friedel http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Lindner, Robert
Friedel, Caroline C.
A Comprehensive Evaluation of Alignment Algorithms in the Context of RNA-Seq
title A Comprehensive Evaluation of Alignment Algorithms in the Context of RNA-Seq
title_sort comprehensive evaluation of alignment algorithms in the context of rna-seq
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3530550/
https://www.ncbi.nlm.nih.gov/pubmed/23300661
http://dx.doi.org/10.1371/journal.pone.0052403