SSW Library: An SIMD Smith-Waterman C/C++ Library for Use in Genomic Applications


Bibliographic Details
Main authors: Zhao, Mengyao, Lee, Wan-Ping, Garrison, Erik P., Marth, Gabor T.
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3852983/
https://www.ncbi.nlm.nih.gov/pubmed/24324759
http://dx.doi.org/10.1371/journal.pone.0082138
author Zhao, Mengyao
Lee, Wan-Ping
Garrison, Erik P.
Marth, Gabor T.
collection PubMed
description BACKGROUND: The Smith-Waterman algorithm, which produces the optimal pairwise alignment between two sequences, is frequently used as a key component of fast heuristic read mapping and variation detection tools for next-generation sequencing data. Although various fast Smith-Waterman implementations have been developed, they are either designed as monolithic protein database searching tools, which do not return detailed alignments, or are embedded into other tools. These issues make reusing these efficient Smith-Waterman implementations impractical. RESULTS: To facilitate easy integration of the fast Single-Instruction-Multiple-Data Smith-Waterman algorithm into third-party software, we wrote a C/C++ library, which extends Farrar’s Striped Smith-Waterman (SSW) to return alignment information in addition to the optimal Smith-Waterman score. In this library we developed a new method to generate the full optimal alignment results and a suboptimal score in linear space at little cost in efficiency. This improvement makes the fast Single-Instruction-Multiple-Data Smith-Waterman genuinely useful in genomic applications. SSW is available both as a C/C++ software library and as a stand-alone alignment tool at: https://github.com/mengyao/Complete-Striped-Smith-Waterman-Library. CONCLUSIONS: The SSW library has been used in the primary read mapping tool MOSAIK, the split-read mapping program SCISSORS, the MEI detector TANGRAM, and the read-overlap graph generation program RZMBLR. The speed of each of these tools improved significantly when its ordinary or banded Smith-Waterman module was replaced with the SSW Library.
format Online
Article
Text
id pubmed-3852983
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3852983 2013-12-09 SSW Library: An SIMD Smith-Waterman C/C++ Library for Use in Genomic Applications Zhao, Mengyao Lee, Wan-Ping Garrison, Erik P. Marth, Gabor T. PLoS One Research Article
Public Library of Science 2013-12-04 /pmc/articles/PMC3852983/ /pubmed/24324759 http://dx.doi.org/10.1371/journal.pone.0082138 Text en © 2013 Zhao et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title SSW Library: An SIMD Smith-Waterman C/C++ Library for Use in Genomic Applications
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3852983/
https://www.ncbi.nlm.nih.gov/pubmed/24324759
http://dx.doi.org/10.1371/journal.pone.0082138
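
The description in this record mentions computing the optimal Smith-Waterman score in linear space. As a rough illustration of that idea only, the sketch below is a plain scalar (non-SIMD) affine-gap local alignment that keeps a single row of the scoring matrix. It is not Farrar's striped method and not the SSW Library's API. The names `sw_result` and `sw_score_linear`, and the convention that a gap of length k costs `gap_open + (k - 1) * gap_extend`, are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type for this sketch (not SSW's own struct). */
typedef struct {
    int32_t score;    /* best local score; 0 if nothing aligns, -1 on allocation failure */
    int32_t ref_end;  /* 0-based end position in ref, -1 if none */
    int32_t read_end; /* 0-based end position in read, -1 if none */
} sw_result;

static int32_t max2(int32_t a, int32_t b) { return a > b ? a : b; }

/* Score-only Smith-Waterman with affine gaps, using O(strlen(read)) memory.
 * mismatch, gap_open and gap_extend are penalties given as non-negative values. */
sw_result sw_score_linear(const char *ref, const char *read,
                          int32_t match, int32_t mismatch,
                          int32_t gap_open, int32_t gap_extend)
{
    sw_result best = {0, -1, -1};
    assert(ref != NULL && read != NULL);
    assert(mismatch >= 0 && gap_open >= 0 && gap_extend >= 0);

    size_t n = strlen(ref), m = strlen(read);
    /* H holds the previous row of scores; F holds gaps that consume ref only. */
    int32_t *H = calloc(m + 1, sizeof *H);
    int32_t *F = calloc(m + 1, sizeof *F);
    if (H == NULL || F == NULL) {
        free(H);
        free(F);
        best.score = -1;
        return best;
    }

    for (size_t i = 1; i <= n; i++) {
        int32_t diag = 0; /* H[i-1][j-1] */
        int32_t left = 0; /* H[i][j-1] */
        int32_t e = 0;    /* gap that consumes read only, within this row */
        for (size_t j = 1; j <= m; j++) {
            int32_t up = H[j]; /* H[i-1][j], read before it is overwritten */
            e = max2(left - gap_open, e - gap_extend);
            F[j] = max2(up - gap_open, F[j] - gap_extend);
            int32_t s = (ref[i - 1] == read[j - 1]) ? match : -mismatch;
            int32_t h = max2(max2(0, diag + s), max2(e, F[j]));
            diag = up;
            H[j] = h;
            left = h;
            if (h > best.score) { /* strict: keeps the first best cell found */
                best.score = h;
                best.ref_end = (int32_t)(i - 1);
                best.read_end = (int32_t)(j - 1);
            }
        }
    }
    free(H);
    free(F);
    return best;
}
```

With the illustrative values match 2, mismatch 2, gap open 3 and gap extend 1, calling `sw_score_linear("ACGT", "ACGT", 2, 2, 3, 1)` yields a score of 8 ending at position 3 in both sequences. Recovering the full alignment, as the library is described as doing, would need a further traceback step that this sketch leaves out.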