
SW#db: GPU-Accelerated Exact Sequence Similarity Database Search

In recent years we have witnessed growth in sequencing yield and in the number of samples sequenced and, as a result, growth of publicly maintained sequence databases. This increase in data places high demands on protein similarity search algorithms with two ever-opposite goal...


Bibliographic Details
Main Authors:	Korpar, Matija; Šošić, Martin; Blažeka, Dino; Šikić, Mile
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2015
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4699916/ https://www.ncbi.nlm.nih.gov/pubmed/26719890 http://dx.doi.org/10.1371/journal.pone.0145857

_version_	1782408250597048320
author	Korpar, Matija; Šošić, Martin; Blažeka, Dino; Šikić, Mile
author_facet	Korpar, Matija; Šošić, Martin; Blažeka, Dino; Šikić, Mile
author_sort	Korpar, Matija
collection	PubMed
description	In recent years we have witnessed growth in sequencing yield and in the number of samples sequenced and, as a result, growth of publicly maintained sequence databases. This increase in data places high demands on protein similarity search algorithms, which must balance two opposing goals: keeping running times acceptable while maintaining a sufficiently high level of sensitivity. The most time-consuming step of similarity search is the local alignment of the query against database sequences. This step is usually performed using exact local alignment algorithms such as Smith-Waterman. Because of its quadratic time complexity, aligning a query to the whole database is usually too slow. Therefore, most protein similarity search methods apply heuristics before the exact local alignment to reduce the number of candidate sequences in the database. However, the query sequence must still be aligned to the reduced database. In this paper we present SW#db, a tool and library for fast exact similarity search. Although its running times as a standalone tool are comparable to those of BLAST, it is primarily intended for the exact local alignment phase, in which the database of sequences has already been reduced. It uses both GPU and CPU parallelization and, at the time of writing, was 4–5 times faster than SSEARCH, 6–25 times faster than CUDASW++, and more than 20 times faster than SSW, using multiple queries on the Swiss-Prot and UniRef90 databases.
format	Online Article Text
id	pubmed-4699916
institution	National Center for Biotechnology Information
language	English
publishDate	2015
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-4699916 2016-01-14 SW#db: GPU-Accelerated Exact Sequence Similarity Database Search Korpar, Matija; Šošić, Martin; Blažeka, Dino; Šikić, Mile PLoS One Research Article Public Library of Science 2015-12-31 /pmc/articles/PMC4699916/ /pubmed/26719890 http://dx.doi.org/10.1371/journal.pone.0145857 Text en © 2015 Korpar et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle	Research Article Korpar, Matija Šošić, Martin Blažeka, Dino Šikić, Mile SW#db: GPU-Accelerated Exact Sequence Similarity Database Search
title	SW#db: GPU-Accelerated Exact Sequence Similarity Database Search
title_full	SW#db: GPU-Accelerated Exact Sequence Similarity Database Search
title_fullStr	SW#db: GPU-Accelerated Exact Sequence Similarity Database Search
title_full_unstemmed	SW#db: GPU-Accelerated Exact Sequence Similarity Database Search
title_short	SW#db: GPU-Accelerated Exact Sequence Similarity Database Search
title_sort	sw#db: gpu-accelerated exact sequence similarity database search
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4699916/ https://www.ncbi.nlm.nih.gov/pubmed/26719890 http://dx.doi.org/10.1371/journal.pone.0145857
work_keys_str_mv	AT korparmatija swdbgpuacceleratedexactsequencesimilaritydatabasesearch AT sosicmartin swdbgpuacceleratedexactsequencesimilaritydatabasesearch AT blazekadino swdbgpuacceleratedexactsequencesimilaritydatabasesearch AT sikicmile swdbgpuacceleratedexactsequencesimilaritydatabasesearch
