
Improving read mapping using additional prefix grams

BACKGROUND: Next-generation sequencing (NGS) enables rapid production of billions of bases at a relatively low cost. Mapping reads from next-generation sequencers to a given reference genome is an important first step in many sequencing applications. Popular read mappers, such as Bowtie and BWA, are...

Bibliographic Details
Main authors: Kim, Jongik; Li, Chen; Xie, Xiaohui
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3927682/
https://www.ncbi.nlm.nih.gov/pubmed/24499321
http://dx.doi.org/10.1186/1471-2105-15-42
author Kim, Jongik
Li, Chen
Xie, Xiaohui
collection PubMed
description BACKGROUND: Next-generation sequencing (NGS) enables rapid production of billions of bases at a relatively low cost. Mapping reads from next-generation sequencers to a given reference genome is an important first step in many sequencing applications. Popular read mappers, such as Bowtie and BWA, are optimized to return the top one or a few candidate locations for each read. However, identifying all mapping locations of each read, instead of just one or a few, is also important in some sequencing applications, such as ChIP-seq for discovering binding sites in repeat regions and RNA-seq for transcript abundance estimation. RESULTS: Here we present Hobbes2, a software package designed for fast and accurate alignment of NGS reads and specialized in identifying all mapping locations of each read. Hobbes2 efficiently identifies all mapping locations of reads using a novel technique that utilizes additional prefix q-grams to improve filtering. We extensively compare Hobbes2 with state-of-the-art read mappers and show that Hobbes2 can be an order of magnitude faster than other read mappers while consuming less memory and achieving similar accuracy. CONCLUSIONS: We propose Hobbes2 to improve the accuracy of read mapping, specialized in identifying all mapping locations of each read. Hobbes2 is implemented in C++, and the source code is freely available for download at http://hobbes.ics.uci.edu.
format Online
Article
Text
id pubmed-3927682
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3927682 2014-05-10 Improving read mapping using additional prefix grams Kim, Jongik Li, Chen Xie, Xiaohui BMC Bioinformatics Methodology Article BioMed Central 2014-02-05 /pmc/articles/PMC3927682/ /pubmed/24499321 http://dx.doi.org/10.1186/1471-2105-15-42 Text en Copyright © 2014 Kim et al.; licensee BioMed Central Ltd.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Methodology Article