libgapmis: extending short-read alignments

BACKGROUND: A wide variety of short-read alignment programmes have been published recently to tackle the problem of mapping millions of short reads to a reference genome, focusing on different aspects of the procedure such as time and memory efficiency, sensitivity, and accuracy. These tools allow f...

Bibliographic Details
Main Authors: Alachiotis, Nikolaos; Berger, Simon; Flouri, Tomáš; Pissis, Solon P; Stamatakis, Alexandros
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3821552/
https://www.ncbi.nlm.nih.gov/pubmed/24564250
http://dx.doi.org/10.1186/1471-2105-14-S11-S4
_version_ 1782290315657347072
author Alachiotis, Nikolaos
Berger, Simon
Flouri, Tomáš
Pissis, Solon P
Stamatakis, Alexandros
author_sort Alachiotis, Nikolaos
collection PubMed
description BACKGROUND: A wide variety of short-read alignment programmes have been published recently to tackle the problem of mapping millions of short reads to a reference genome, focusing on different aspects of the procedure such as time and memory efficiency, sensitivity, and accuracy. These tools allow for a small number of mismatches in the alignment; however, their ability to allow for gaps varies greatly, with many performing poorly or not allowing them at all. The seed-and-extend strategy is applied in most short-read alignment programmes. After aligning a substring of the reference sequence against the high-quality prefix of a short read (the seed), an important problem is to find the best possible alignment between the substring of the reference sequence that follows the seed and the remaining low-quality suffix of the read (the extend step). The fact that the reads are rather short and that the gap occurrence frequency observed in various studies is rather low suggests that aligning (parts of) those reads with a single gap is in fact desirable. RESULTS: In this article, we present libgapmis, a library for extending pairwise short-read alignments. Apart from the standard CPU version, it includes ultrafast SSE- and GPU-based implementations. libgapmis is based on an algorithm computing a modified version of the traditional dynamic-programming matrix for sequence alignment. Extensive experimental results demonstrate that the functions of the CPU version provided in this library accelerate the computations by a factor of 20 compared to other programmes. The analogous SSE- and GPU-based implementations accelerate the computations by factors of 6 and 11, respectively, compared to the CPU version. The library also gives the user the flexibility to split the read into fragments, based on the observed gap occurrence frequency and the length of the read, thereby allowing for a variable, but bounded, number of gaps in the alignment. 
CONCLUSIONS: We present libgapmis, a library for extending pairwise short-read alignments. We show that libgapmis is better suited and more efficient than existing algorithms for this task. The importance of our contribution is underlined by the fact that the provided functions may be seamlessly integrated into any short-read alignment pipeline. The open-source code of libgapmis is available at http://www.exelixis-lab.org/gapmis.
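The single-gap idea in the abstract can be illustrated with a minimal Python sketch. This is not the libgapmis API and not the modified dynamic-programming matrix the authors describe; it is a simplified stand-in that assumes a global alignment in which the one permitted gap has exactly the length difference of the two sequences. The function name `single_gap_align`, its return tuple, and the `gap_open`/`gap_extend` penalties are hypothetical choices made for illustration only.

```python
def single_gap_align(read, ref, gap_open=2, gap_extend=1):
    """Align two strings end to end with mismatches and at most one gap.

    Hypothetical helper, not part of libgapmis. The single gap is forced to
    have length abs(len(ref) - len(read)) and is inserted into the shorter
    sequence. Returns (cost, gap_pos, gap_len, gap_in), where gap_pos is the
    index in the shorter sequence at which the gap opens and gap_in is
    "read", "ref", or None when no gap is needed.
    """
    if len(read) <= len(ref):
        short, long_, gap_in = read, ref, "read"
    else:
        short, long_, gap_in = ref, read, "ref"
    d = len(long_) - len(short)   # forced gap length
    m = len(short)

    # pre[p]: mismatches when short[:p] is aligned with long_[:p]
    pre = [0] * (m + 1)
    for i in range(m):
        pre[i + 1] = pre[i] + (short[i] != long_[i])

    # suf[p]: mismatches when short[p:] is aligned with long_[p + d:]
    suf = [0] * (m + 1)
    for i in range(m - 1, -1, -1):
        suf[i] = suf[i + 1] + (short[i] != long_[i + d])

    # Affine gap penalty (an assumption; any penalty scheme could be used).
    gap_cost = 0 if d == 0 else gap_open + gap_extend * (d - 1)

    # Try every gap position and keep the leftmost one with fewest mismatches.
    best_pos = min(range(m + 1), key=lambda p: pre[p] + suf[p])
    mismatches = pre[best_pos] + suf[best_pos]
    return (mismatches + gap_cost, best_pos, d, gap_in if d else None)


if __name__ == "__main__":
    # The reference carries two extra bases relative to the read.
    print(single_gap_align("ACGTACGT", "ACGTTTACGT"))
```

Because the prefix and suffix mismatch counts are precomputed, every gap position is scored in constant time, so the sketch runs in time linear in the sequence length. Splitting a read into fragments, as the abstract mentions, would amount to calling such a routine once per fragment so that each fragment contributes at most one gap.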
format Online
Article
Text
id pubmed-3821552
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3821552 2013-11-11 libgapmis: extending short-read alignments Alachiotis, Nikolaos Berger, Simon Flouri, Tomáš Pissis, Solon P Stamatakis, Alexandros BMC Bioinformatics Research BACKGROUND: A wide variety of short-read alignment programmes have been published recently to tackle the problem of mapping millions of short reads to a reference genome, focusing on different aspects of the procedure such as time and memory efficiency, sensitivity, and accuracy. These tools allow for a small number of mismatches in the alignment; however, their ability to allow for gaps varies greatly, with many performing poorly or not allowing them at all. The seed-and-extend strategy is applied in most short-read alignment programmes. After aligning a substring of the reference sequence against the high-quality prefix of a short read (the seed), an important problem is to find the best possible alignment between the substring of the reference sequence that follows the seed and the remaining low-quality suffix of the read (the extend step). The fact that the reads are rather short and that the gap occurrence frequency observed in various studies is rather low suggests that aligning (parts of) those reads with a single gap is in fact desirable. RESULTS: In this article, we present libgapmis, a library for extending pairwise short-read alignments. Apart from the standard CPU version, it includes ultrafast SSE- and GPU-based implementations. libgapmis is based on an algorithm computing a modified version of the traditional dynamic-programming matrix for sequence alignment. Extensive experimental results demonstrate that the functions of the CPU version provided in this library accelerate the computations by a factor of 20 compared to other programmes. The analogous SSE- and GPU-based implementations accelerate the computations by factors of 6 and 11, respectively, compared to the CPU version. 
The library also gives the user the flexibility to split the read into fragments, based on the observed gap occurrence frequency and the length of the read, thereby allowing for a variable, but bounded, number of gaps in the alignment. CONCLUSIONS: We present libgapmis, a library for extending pairwise short-read alignments. We show that libgapmis is better suited and more efficient than existing algorithms for this task. The importance of our contribution is underlined by the fact that the provided functions may be seamlessly integrated into any short-read alignment pipeline. The open-source code of libgapmis is available at http://www.exelixis-lab.org/gapmis. BioMed Central 2013-11-04 /pmc/articles/PMC3821552/ /pubmed/24564250 http://dx.doi.org/10.1186/1471-2105-14-S11-S4 Text en Copyright © 2013 Alachiotis et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Alachiotis, Nikolaos
Berger, Simon
Flouri, Tomáš
Pissis, Solon P
Stamatakis, Alexandros
libgapmis: extending short-read alignments
title libgapmis: extending short-read alignments
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3821552/
https://www.ncbi.nlm.nih.gov/pubmed/24564250
http://dx.doi.org/10.1186/1471-2105-14-S11-S4
work_keys_str_mv AT alachiotisnikolaos libgapmisextendingshortreadalignments
AT bergersimon libgapmisextendingshortreadalignments
AT flouritomas libgapmisextendingshortreadalignments
AT pississolonp libgapmisextendingshortreadalignments
AT stamatakisalexandros libgapmisextendingshortreadalignments