
Long Read Alignment with Parallel MapReduce Cloud Platform

Genomic sequence alignment is an important technique to decode genome sequences in bioinformatics. Next-Generation Sequencing technologies produce genomic data of longer reads. Cloud platforms are adopted to address the problems arising from storage and analysis of large genomic data. Existing genes...


Bibliographic Details
Main Authors: Al-Absi, Ahmed Abdulhakim, Kang, Dae-Ki
Format: Online Article Text
Language: English
Published: Hindawi Publishing Corporation 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4709609/
https://www.ncbi.nlm.nih.gov/pubmed/26839887
http://dx.doi.org/10.1155/2015/807407
_version_ 1782409672733491200
author Al-Absi, Ahmed Abdulhakim
Kang, Dae-Ki
collection PubMed
description Genomic sequence alignment is an important technique for decoding genome sequences in bioinformatics. Next-Generation Sequencing technologies produce genomic data with longer reads. Cloud platforms are adopted to address the problems arising from the storage and analysis of large genomic data. Existing gene sequencing tools for cloud platforms predominantly consider short-read gene sequences and adopt the Hadoop MapReduce framework for computation. However, serial execution of the map and reduce phases is a problem in such systems. Therefore, in this paper, we introduce the Burrows-Wheeler Aligner's Smith-Waterman Alignment on Parallel MapReduce (BWASW-PMR) cloud platform for long sequence alignment. The proposed cloud platform adopts the widely accepted and accurate BWA-SW algorithm for long sequence alignment. A custom MapReduce platform is developed to overcome the drawbacks of the Hadoop framework. A parallel execution strategy for the MapReduce phases and an optimization of the Smith-Waterman algorithm are considered. Performance evaluation shows an average speed-up of 6.7 for BWASW-PMR compared with the state-of-the-art Bwasw-Cloud. An average reduction of 30% in the map phase makespan is reported across all experiments comparing BWASW-PMR with Bwasw-Cloud. Optimization of Smith-Waterman reduces the execution time by 91.8%. The experimental study demonstrates the efficiency of BWASW-PMR for aligning long genomic sequences on cloud platforms.
format Online
Article
Text
id pubmed-4709609
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-47096092016-02-02 Long Read Alignment with Parallel MapReduce Cloud Platform Al-Absi, Ahmed Abdulhakim Kang, Dae-Ki Biomed Res Int Research Article Hindawi Publishing Corporation 2015 2015-12-29 /pmc/articles/PMC4709609/ /pubmed/26839887 http://dx.doi.org/10.1155/2015/807407 Text en Copyright © 2015 A. A. Al-Absi and D.-K. Kang.
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Long Read Alignment with Parallel MapReduce Cloud Platform
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4709609/
https://www.ncbi.nlm.nih.gov/pubmed/26839887
http://dx.doi.org/10.1155/2015/807407