Hobbes: optimized gram-based methods for efficient read alignment

Recent advances in sequencing technology have enabled the rapid generation of billions of bases at relatively low cost. A crucial first step in many sequencing applications is to map those reads to a reference genome. However, when the reference genome is large, finding accurate mappings poses a sig...

Bibliographic Details
Main authors: Ahmadi, Athena; Behm, Alexander; Honnalli, Nagesh; Li, Chen; Weng, Lingjie; Xie, Xiaohui
Format: Online Article Text
Language: English
Published: Oxford University Press 2012
Subjects: Methods Online
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3315303/
https://www.ncbi.nlm.nih.gov/pubmed/22199254
http://dx.doi.org/10.1093/nar/gkr1246
author Ahmadi, Athena
Behm, Alexander
Honnalli, Nagesh
Li, Chen
Weng, Lingjie
Xie, Xiaohui
collection PubMed
description Recent advances in sequencing technology have enabled the rapid generation of billions of bases at relatively low cost. A crucial first step in many sequencing applications is to map those reads to a reference genome. However, when the reference genome is large, finding accurate mappings poses a significant computational challenge due to the sheer amount of reads, and because many reads map to the reference sequence approximately but not exactly. We introduce Hobbes, a new gram-based program for aligning short reads, supporting Hamming and edit distance. Hobbes implements two novel techniques, which yield substantial performance improvements: an optimized gram-selection procedure for reads, and a cache-efficient filter for pruning candidate mappings. We systematically tested the performance of Hobbes on both real and simulated data with read lengths varying from 35 to 100 bp, and compared its performance with several state-of-the-art read-mapping programs, including Bowtie, BWA, mrsFast and RazerS. Hobbes is faster than all other read mapping programs we have tested while maintaining high mapping quality. Hobbes is about five times faster than Bowtie and about 2–10 times faster than BWA, depending on read length and error rate, when asked to find all mapping locations of a read in the human genome within a given Hamming or edit distance, respectively. Hobbes supports the SAM output format and is publicly available at http://hobbes.ics.uci.edu.
format Online
Article
Text
id pubmed-3315303
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3315303 2012-03-30 Hobbes: optimized gram-based methods for efficient read alignment Ahmadi, Athena Behm, Alexander Honnalli, Nagesh Li, Chen Weng, Lingjie Xie, Xiaohui Nucleic Acids Res Methods Online Oxford University Press 2012-03 2011-12-22 /pmc/articles/PMC3315303/ /pubmed/22199254 http://dx.doi.org/10.1093/nar/gkr1246 Text en © The Author(s) 2011. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Hobbes: optimized gram-based methods for efficient read alignment
topic Methods Online