
HIVE-Hexagon: High-Performance, Parallelized Sequence Alignment for Next-Generation Sequencing Data Analysis

Due to the size of Next-Generation Sequencing data, the computational challenge of sequence alignment has been vast. Inexact alignments can take up to 90% of total CPU time in bioinformatics pipelines. High-performance Integrated Virtual Environment (HIVE), a cloud-based environment optimized for st...


Bibliographic Details
Main authors: Santana-Quintero, Luis; Dingerdissen, Hayley; Thierry-Mieg, Jean; Mazumder, Raja; Simonyan, Vahan
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4053384/
https://www.ncbi.nlm.nih.gov/pubmed/24918764
http://dx.doi.org/10.1371/journal.pone.0099033
author Santana-Quintero, Luis
Dingerdissen, Hayley
Thierry-Mieg, Jean
Mazumder, Raja
Simonyan, Vahan
author_facet Santana-Quintero, Luis
Dingerdissen, Hayley
Thierry-Mieg, Jean
Mazumder, Raja
Simonyan, Vahan
author_sort Santana-Quintero, Luis
collection PubMed
description Due to the size of Next-Generation Sequencing data, the computational challenge of sequence alignment has been vast. Inexact alignments can take up to 90% of total CPU time in bioinformatics pipelines. High-performance Integrated Virtual Environment (HIVE), a cloud-based environment optimized for storage and analysis of extra-large data, presents an algorithmic solution: the HIVE-hexagon DNA sequence aligner. HIVE-hexagon implements novel approaches to exploit both characteristics of sequence space and CPU, RAM and Input/Output (I/O) architecture to quickly compute accurate alignments. Key components of HIVE-hexagon include non-redundification and sorting of sequences; floating diagonals of linearized dynamic programming matrices; and consideration of cross-similarity to minimize computations. AVAILABILITY: https://hive.biochemistry.gwu.edu/hive/
format Online
Article
Text
id pubmed-4053384
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4053384 2014-06-18 HIVE-Hexagon: High-Performance, Parallelized Sequence Alignment for Next-Generation Sequencing Data Analysis Santana-Quintero, Luis Dingerdissen, Hayley Thierry-Mieg, Jean Mazumder, Raja Simonyan, Vahan PLoS One Research Article Due to the size of Next-Generation Sequencing data, the computational challenge of sequence alignment has been vast. Inexact alignments can take up to 90% of total CPU time in bioinformatics pipelines. High-performance Integrated Virtual Environment (HIVE), a cloud-based environment optimized for storage and analysis of extra-large data, presents an algorithmic solution: the HIVE-hexagon DNA sequence aligner. HIVE-hexagon implements novel approaches to exploit both characteristics of sequence space and CPU, RAM and Input/Output (I/O) architecture to quickly compute accurate alignments. Key components of HIVE-hexagon include non-redundification and sorting of sequences; floating diagonals of linearized dynamic programming matrices; and consideration of cross-similarity to minimize computations. AVAILABILITY: https://hive.biochemistry.gwu.edu/hive/ Public Library of Science 2014-06-11 /pmc/articles/PMC4053384/ /pubmed/24918764 http://dx.doi.org/10.1371/journal.pone.0099033 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open-access article distributed under the terms of the Creative Commons Public Domain declaration, which stipulates that, once placed in the public domain, this work may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose.
spellingShingle Research Article
Santana-Quintero, Luis
Dingerdissen, Hayley
Thierry-Mieg, Jean
Mazumder, Raja
Simonyan, Vahan
HIVE-Hexagon: High-Performance, Parallelized Sequence Alignment for Next-Generation Sequencing Data Analysis
title HIVE-Hexagon: High-Performance, Parallelized Sequence Alignment for Next-Generation Sequencing Data Analysis
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4053384/
https://www.ncbi.nlm.nih.gov/pubmed/24918764
http://dx.doi.org/10.1371/journal.pone.0099033