GPU accelerated sequence alignment with traceback for GATK HaplotypeCaller
BACKGROUND: Pairwise sequence alignment is widely used in many biological tools and applications. Existing GPU accelerated implementations mainly focus on calculating optimal alignment score and omit identifying the optimal alignment itself. In GATK HaplotypeCaller (HC), the semi-global pairwise seq...
Main authors: | Ren, Shanshan; Ahmed, Nauman; Bertels, Koen; Al-Ars, Zaid |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6456962/ https://www.ncbi.nlm.nih.gov/pubmed/30967111 http://dx.doi.org/10.1186/s12864-019-5468-9 |
_version_ | 1783409835188944896 |
---|---|
author | Ren, Shanshan; Ahmed, Nauman; Bertels, Koen; Al-Ars, Zaid |
author_facet | Ren, Shanshan; Ahmed, Nauman; Bertels, Koen; Al-Ars, Zaid |
author_sort | Ren, Shanshan |
collection | PubMed |
description | BACKGROUND: Pairwise sequence alignment is widely used in many biological tools and applications. Existing GPU accelerated implementations mainly focus on calculating the optimal alignment score and omit identifying the optimal alignment itself. In GATK HaplotypeCaller (HC), the semi-global pairwise sequence alignment with traceback has so far been difficult to accelerate effectively on GPUs. RESULTS: We first analyze the characteristics of the semi-global alignment with traceback in GATK HC and then propose a new algorithm that allows the optimal alignment to be retrieved efficiently on GPUs. For the first stage, we choose an intra-task parallelization model to calculate the position of the optimal alignment score and the backtracking matrix. Moreover, in the first stage, our GPU implementation also records the lengths of consecutive matches/mismatches, in addition to the lengths of consecutive insertions and deletions that the CPU-based implementation records. This makes it possible to traverse the backtracking matrix efficiently and obtain the optimal alignment in the second stage. CONCLUSIONS: Experimental results show that our alignment kernel with traceback is up to 80x and 14.14x faster than its CPU counterpart on synthetic datasets and real datasets, respectively. When integrated into GATK HC (alongside a GPU accelerated pair-HMMs forward kernel), the overall speedup is 2.3x over the baseline GATK HC implementation and 1.34x over the GATK HC implementation with the integrated GPU-based pair-HMMs forward algorithm. Although the methods proposed in this paper are intended to improve the performance of GATK HC, they can also be used in other pairwise alignments and applications. |
format | Online Article Text |
id | pubmed-6456962 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6456962 2019-04-19 GPU accelerated sequence alignment with traceback for GATK HaplotypeCaller Ren, Shanshan Ahmed, Nauman Bertels, Koen Al-Ars, Zaid BMC Genomics Research
BioMed Central 2019-04-04 /pmc/articles/PMC6456962/ /pubmed/30967111 http://dx.doi.org/10.1186/s12864-019-5468-9 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Ren, Shanshan Ahmed, Nauman Bertels, Koen Al-Ars, Zaid GPU accelerated sequence alignment with traceback for GATK HaplotypeCaller |
title | GPU accelerated sequence alignment with traceback for GATK HaplotypeCaller |
title_full | GPU accelerated sequence alignment with traceback for GATK HaplotypeCaller |
title_fullStr | GPU accelerated sequence alignment with traceback for GATK HaplotypeCaller |
title_full_unstemmed | GPU accelerated sequence alignment with traceback for GATK HaplotypeCaller |
title_short | GPU accelerated sequence alignment with traceback for GATK HaplotypeCaller |
title_sort | gpu accelerated sequence alignment with traceback for gatk haplotypecaller |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6456962/ https://www.ncbi.nlm.nih.gov/pubmed/30967111 http://dx.doi.org/10.1186/s12864-019-5468-9 |
work_keys_str_mv | AT renshanshan gpuacceleratedsequencealignmentwithtracebackforgatkhaplotypecaller AT ahmednauman gpuacceleratedsequencealignmentwithtracebackforgatkhaplotypecaller AT bertelskoen gpuacceleratedsequencealignmentwithtracebackforgatkhaplotypecaller AT alarszaid gpuacceleratedsequencealignmentwithtracebackforgatkhaplotypecaller |
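
The description in this record outlines a two-stage scheme: fill a score matrix together with a backtracking matrix, then walk the backtracking matrix from the best-scoring end position to recover the alignment as runs of consecutive matches/mismatches, insertions, and deletions. The Python sketch below illustrates that general idea on a CPU only. It is not the authors' GPU kernel and not GATK's code: the function name `semiglobal_align`, the scoring values, and the use of a linear (not affine) gap penalty are all assumptions made for illustration. "Semi-global" is taken here to mean that the whole query is aligned while unaligned reference bases before and after it are free.

```python
def semiglobal_align(ref, query, match=2, mismatch=-1, gap=-2):
    """Illustrative semi-global alignment with traceback (hypothetical scoring).

    Returns (best_score, ref_start, cigar), where cigar run-length encodes
    M (match/mismatch), I (base only in query), D (base only in ref).
    """
    n, m = len(ref), len(query)
    # Stage 1: fill the score matrix and the backtracking matrix.
    # Row 0 stays 0 so that leading reference bases cost nothing.
    score = [[0] * (n + 1) for _ in range(m + 1)]
    move = [[""] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        score[i][0] = i * gap
        move[i][0] = "I"
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = match if query[i - 1] == ref[j - 1] else mismatch
            diag = score[i - 1][j - 1] + sub
            ins = score[i - 1][j] + gap
            dele = score[i][j - 1] + gap
            best = max(diag, ins, dele)
            score[i][j] = best
            move[i][j] = "M" if best == diag else ("I" if best == ins else "D")

    # The best end position is searched in the last row only, so trailing
    # reference bases are free as well.
    end = max(range(n + 1), key=lambda j: score[m][j])

    # Stage 2: walk back from (m, end), merging consecutive identical
    # operations into runs instead of emitting them one by one.
    i, j = m, end
    runs = []
    while i > 0:
        op = move[i][j]
        if runs and runs[-1][0] == op:
            runs[-1][1] += 1
        else:
            runs.append([op, 1])
        if op == "M":
            i, j = i - 1, j - 1
        elif op == "I":
            i -= 1
        else:
            j -= 1
    runs.reverse()
    cigar = "".join(f"{length}{op}" for op, length in runs)
    return score[m][end], j, cigar


print(semiglobal_align("AAACCCGGG", "AAAGGG"))  # (6, 0, '3M3D3M')
```

In this sketch the run lengths are accumulated during the traceback walk. The record's description states that the GPU implementation instead records them during the first stage, which presumably lets the second stage skip over a whole run at once; that optimization is not reproduced here.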