Parallel and Scalable Short-Read Alignment on Multi-Core Clusters Using UPC++

The growth of next-generation sequencing (NGS) datasets poses a challenge to the alignment of reads to reference genomes in terms of alignment quality and execution speed. Some available aligners have been shown to obtain high quality mappings at the expense of long execution times. Finding fast yet...

Bibliographic Details
Main authors:	González-Domínguez, Jorge; Liu, Yongchao; Schmidt, Bertil
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2016
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4711716/ https://www.ncbi.nlm.nih.gov/pubmed/26731399 http://dx.doi.org/10.1371/journal.pone.0145490

author	González-Domínguez, Jorge; Liu, Yongchao; Schmidt, Bertil
collection	PubMed
description	The growth of next-generation sequencing (NGS) datasets poses a challenge to the alignment of reads to reference genomes in terms of alignment quality and execution speed. Some available aligners have been shown to obtain high quality mappings at the expense of long execution times. Finding fast yet accurate software solutions is of high importance to research, since availability and size of NGS datasets continue to increase. In this work we present an efficient parallelization approach for NGS short-read alignment on multi-core clusters. Our approach takes advantage of a distributed shared memory programming model based on the new UPC++ language. Experimental results using the CUSHAW3 aligner show that our implementation based on dynamic scheduling obtains good scalability on multi-core clusters. Through our evaluation, we are able to complete the single-end and paired-end alignments of 246 million reads of length 150 base-pairs in 11.54 and 16.64 minutes, respectively, using 32 nodes with four AMD Opteron 6272 16-core CPUs per node. In contrast, the multi-threaded original tool needs 2.77 and 5.54 hours to perform the same alignments on the 64 cores of one node. The source code of our parallel implementation is publicly available at the CUSHAW3 homepage (http://cushaw3.sourceforge.net).
format	Online Article Text
id	pubmed-4711716
institution	National Center for Biotechnology Information
language	English
publishDate	2016
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-4711716; 2016-01-26; PLoS One; Research Article; Public Library of Science, 2016-01-05; /pmc/articles/PMC4711716/; /pubmed/26731399; http://dx.doi.org/10.1371/journal.pone.0145490; Text; en; © 2016 González-Domínguez et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title	Parallel and Scalable Short-Read Alignment on Multi-Core Clusters Using UPC++
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4711716/ https://www.ncbi.nlm.nih.gov/pubmed/26731399 http://dx.doi.org/10.1371/journal.pone.0145490
