High-throughput sequence alignment using Graphics Processing Units

BACKGROUND: The recent availability of new, less expensive high-throughput DNA sequencing technologies has yielded a dramatic increase in the volume of sequence data that must be analyzed. These data are being generated for several purposes, including genotyping, genome resequencing, metagenomics, a...

Bibliographic Details
Main authors: Schatz, Michael C; Trapnell, Cole; Delcher, Arthur L; Varshney, Amitabh
Format: Text
Language: English
Published: BioMed Central 2007
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2222658/
https://www.ncbi.nlm.nih.gov/pubmed/18070356
http://dx.doi.org/10.1186/1471-2105-8-474
author Schatz, Michael C
Trapnell, Cole
Delcher, Arthur L
Varshney, Amitabh
collection PubMed
description BACKGROUND: The recent availability of new, less expensive high-throughput DNA sequencing technologies has yielded a dramatic increase in the volume of sequence data that must be analyzed. These data are being generated for several purposes, including genotyping, genome resequencing, metagenomics, and de novo genome assembly projects. Sequence alignment programs such as MUMmer have proven essential for analysis of these data, but researchers will need ever faster, high-throughput alignment tools running on inexpensive hardware to keep up with new sequence technologies. RESULTS: This paper describes MUMmerGPU, an open-source high-throughput parallel pairwise local sequence alignment program that runs on commodity Graphics Processing Units (GPUs) in common workstations. MUMmerGPU uses the new Compute Unified Device Architecture (CUDA) from nVidia to align multiple query sequences against a single reference sequence stored as a suffix tree. By processing the queries in parallel on the highly parallel graphics card, MUMmerGPU achieves more than a 10-fold speedup over a serial CPU version of the sequence alignment kernel, and outperforms the exact alignment component of MUMmer on a high end CPU by 3.5-fold in total application time when aligning reads from recent sequencing projects using Solexa/Illumina, 454, and Sanger sequencing technologies. CONCLUSION: MUMmerGPU is a low cost, ultra-fast sequence alignment program designed to handle the increasing volume of data produced by new, high-throughput sequencing technologies. MUMmerGPU demonstrates that even memory-intensive applications can run significantly faster on the relatively low-cost GPU than on the CPU.
format Text
id pubmed-2222658
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2222658 2008-02-01 High-throughput sequence alignment using Graphics Processing Units Schatz, Michael C Trapnell, Cole Delcher, Arthur L Varshney, Amitabh BMC Bioinformatics Software
BioMed Central 2007-12-10 /pmc/articles/PMC2222658/ /pubmed/18070356 http://dx.doi.org/10.1186/1471-2105-8-474 Text en Copyright © 2007 Schatz et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title High-throughput sequence alignment using Graphics Processing Units
topic Software