Improving read mapping using additional prefix grams
BACKGROUND: Next-generation sequencing (NGS) enables rapid production of billions of bases at a relatively low cost. Mapping reads from next-generation sequencers to a given reference genome is an important first step in many sequencing applications. Popular read mappers, such as Bowtie and BWA, are...
Main authors: | Kim, Jongik; Li, Chen; Xie, Xiaohui |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3927682/ https://www.ncbi.nlm.nih.gov/pubmed/24499321 http://dx.doi.org/10.1186/1471-2105-15-42 |
_version_ | 1782304164939825152 |
---|---|
author | Kim, Jongik Li, Chen Xie, Xiaohui |
author_facet | Kim, Jongik Li, Chen Xie, Xiaohui |
author_sort | Kim, Jongik |
collection | PubMed |
description | BACKGROUND: Next-generation sequencing (NGS) enables rapid production of billions of bases at a relatively low cost. Mapping reads from next-generation sequencers to a given reference genome is an important first step in many sequencing applications. Popular read mappers, such as Bowtie and BWA, are optimized to return the top one or a few candidate locations of each read. However, identifying all mapping locations of each read, instead of just one or a few, is also important in some sequencing applications, such as ChIP-seq for discovering binding sites in repeat regions and RNA-seq for transcript abundance estimation. RESULTS: Here we present Hobbes2, a software package designed for fast and accurate alignment of NGS reads and specialized in identifying all mapping locations of each read. Hobbes2 efficiently identifies all mapping locations of reads using a novel technique that utilizes additional prefix q-grams to improve filtering. We extensively compare Hobbes2 with state-of-the-art read mappers, and show that Hobbes2 can be an order of magnitude faster than other read mappers while consuming less memory and achieving similar accuracy. CONCLUSIONS: We propose Hobbes2 to improve the accuracy of read mapping, specialized in identifying all mapping locations of each read. Hobbes2 is implemented in C++, and the source code is freely available for download at http://hobbes.ics.uci.edu. |
format | Online Article Text |
id | pubmed-3927682 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3927682 2014-05-10 Improving read mapping using additional prefix grams Kim, Jongik Li, Chen Xie, Xiaohui BMC Bioinformatics Methodology Article BioMed Central 2014-02-05 /pmc/articles/PMC3927682/ /pubmed/24499321 http://dx.doi.org/10.1186/1471-2105-15-42 Text en Copyright © 2014 Kim et al.; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methodology Article Kim, Jongik Li, Chen Xie, Xiaohui Improving read mapping using additional prefix grams |
title | Improving read mapping using additional prefix grams |
title_sort | improving read mapping using additional prefix grams |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3927682/ https://www.ncbi.nlm.nih.gov/pubmed/24499321 http://dx.doi.org/10.1186/1471-2105-15-42 |