Coupling SIMD and SIMT architectures to boost performance of a phylogeny-aware alignment kernel

BACKGROUND: Aligning short DNA reads to a reference sequence alignment is a prerequisite for detecting their biological origin and analyzing them in a phylogenetic context. With the PaPaRa tool we introduced a dedicated dynamic programming algorithm for simultaneously aligning short reads to referen...

Bibliographic Details
Main Authors:	Alachiotis, Nikolaos, Berger, Simon A, Stamatakis, Alexandros
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2012
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3496624/ https://www.ncbi.nlm.nih.gov/pubmed/22876807 http://dx.doi.org/10.1186/1471-2105-13-196

_version_	1782249651518308352
author	Alachiotis, Nikolaos Berger, Simon A Stamatakis, Alexandros
author_sort	Alachiotis, Nikolaos
collection	PubMed
description	BACKGROUND: Aligning short DNA reads to a reference sequence alignment is a prerequisite for detecting their biological origin and analyzing them in a phylogenetic context. With the PaPaRa tool we introduced a dedicated dynamic programming algorithm for simultaneously aligning short reads to reference alignments and corresponding evolutionary reference trees. The algorithm aligns short reads to phylogenetic profiles that correspond to the branches of such a reference tree. The algorithm needs to perform an immense number of pairwise alignments. Therefore, we explore vector intrinsics and GPUs to accelerate the PaPaRa alignment kernel. RESULTS: We optimized and parallelized PaPaRa on CPUs and GPUs. Via SSE 4.1 SIMD (Single Instruction, Multiple Data) intrinsics for x86 SIMD architectures and multi-threading, we obtained a 9-fold acceleration on a single core as well as linear speedups with respect to the number of cores. The peak CPU performance amounts to 18.1 GCUPS (Giga Cell Updates per Second) using all four physical cores on an Intel i7 2600 CPU running at 3.4 GHz. The average CPU performance (averaged over all test runs) is 12.33 GCUPS. We also used OpenCL to execute PaPaRa on a GPU SIMT (Single Instruction, Multiple Threads) architecture. An NVIDIA GeForce 560 GPU delivered peak and average performance of 22.1 and 18.4 GCUPS, respectively. Finally, we combined the SIMD and SIMT implementations into a hybrid CPU-GPU system that achieved an accumulated peak performance of 33.8 GCUPS. CONCLUSIONS: This accelerated version of PaPaRa (available at http://www.exelixis-lab.org/software.html) provides a significant performance improvement that allows for analyzing larger datasets in less time. We observe that state-of-the-art SIMD and SIMT architectures deliver comparable performance for this dynamic programming kernel when the “competing programmer approach” is deployed. Finally, we show that overall performance can be substantially increased by designing a hybrid CPU-GPU system with appropriate load distribution mechanisms.
format	Online Article Text
id	pubmed-3496624
institution	National Center for Biotechnology Information
language	English
publishDate	2012
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3496624 2012-11-19 Coupling SIMD and SIMT architectures to boost performance of a phylogeny-aware alignment kernel Alachiotis, Nikolaos Berger, Simon A Stamatakis, Alexandros BMC Bioinformatics Research Article BioMed Central 2012-08-09 /pmc/articles/PMC3496624/ /pubmed/22876807 http://dx.doi.org/10.1186/1471-2105-13-196 Text en Copyright ©2012 Alachiotis et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Coupling SIMD and SIMT architectures to boost performance of a phylogeny-aware alignment kernel
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3496624/ https://www.ncbi.nlm.nih.gov/pubmed/22876807 http://dx.doi.org/10.1186/1471-2105-13-196

Similar Items