
BitMapper: an efficient all-mapper based on bit-vector computing

BACKGROUND: As next-generation sequencing (NGS) technologies produce hundreds of millions of reads every day, mapping NGS reads to a given reference genome efficiently is a tremendous computational challenge. However, existing all-mappers, which aim at finding all mapping locations o...


Bibliographic Details
Main Authors: Cheng, Haoyu; Jiang, Huaipan; Yang, Jiaoyun; Xu, Yun; Shang, Yi
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4462005/
https://www.ncbi.nlm.nih.gov/pubmed/26063651
http://dx.doi.org/10.1186/s12859-015-0626-9
_version_ 1782375592748908544
author Cheng, Haoyu
Jiang, Huaipan
Yang, Jiaoyun
Xu, Yun
Shang, Yi
author_facet Cheng, Haoyu
Jiang, Huaipan
Yang, Jiaoyun
Xu, Yun
Shang, Yi
author_sort Cheng, Haoyu
collection PubMed
description BACKGROUND: As next-generation sequencing (NGS) technologies produce hundreds of millions of reads every day, mapping NGS reads to a given reference genome efficiently is a tremendous computational challenge. However, existing all-mappers, which aim at finding all mapping locations of each read, are very time consuming. The majority of existing all-mappers consist of two main parts, filtration and verification. This work significantly reduces verification time, which is the dominant part of the running time. RESULTS: An efficient all-mapper, BitMapper, is developed based on a new vectorized bit-vector algorithm, which simultaneously calculates the edit distance of one read to multiple locations in a given reference genome. Experimental results on both simulated and real data sets show that BitMapper is several times to an order of magnitude faster than the current state-of-the-art all-mappers, while achieving higher sensitivity, i.e., better-quality solutions. CONCLUSIONS: We present BitMapper, which is designed to return all mapping locations of raw reads containing indels as well as mismatches. BitMapper is implemented in C under a GPL license. Binaries are freely available at http://home.ustc.edu.cn/%7Echhy. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0626-9) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4462005
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4462005 2015-06-11 BitMapper: an efficient all-mapper based on bit-vector computing Cheng, Haoyu Jiang, Huaipan Yang, Jiaoyun Xu, Yun Shang, Yi BMC Bioinformatics Methodology Article BACKGROUND: As next-generation sequencing (NGS) technologies produce hundreds of millions of reads every day, mapping NGS reads to a given reference genome efficiently is a tremendous computational challenge. However, existing all-mappers, which aim at finding all mapping locations of each read, are very time consuming. The majority of existing all-mappers consist of two main parts, filtration and verification. This work significantly reduces verification time, which is the dominant part of the running time. RESULTS: An efficient all-mapper, BitMapper, is developed based on a new vectorized bit-vector algorithm, which simultaneously calculates the edit distance of one read to multiple locations in a given reference genome. Experimental results on both simulated and real data sets show that BitMapper is several times to an order of magnitude faster than the current state-of-the-art all-mappers, while achieving higher sensitivity, i.e., better-quality solutions. CONCLUSIONS: We present BitMapper, which is designed to return all mapping locations of raw reads containing indels as well as mismatches. BitMapper is implemented in C under a GPL license. Binaries are freely available at http://home.ustc.edu.cn/%7Echhy. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0626-9) contains supplementary material, which is available to authorized users. BioMed Central 2015-06-11 /pmc/articles/PMC4462005/ /pubmed/26063651 http://dx.doi.org/10.1186/s12859-015-0626-9 Text en © Cheng et al. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Methodology Article
Cheng, Haoyu
Jiang, Huaipan
Yang, Jiaoyun
Xu, Yun
Shang, Yi
BitMapper: an efficient all-mapper based on bit-vector computing
title BitMapper: an efficient all-mapper based on bit-vector computing
title_full BitMapper: an efficient all-mapper based on bit-vector computing
title_fullStr BitMapper: an efficient all-mapper based on bit-vector computing
title_full_unstemmed BitMapper: an efficient all-mapper based on bit-vector computing
title_short BitMapper: an efficient all-mapper based on bit-vector computing
title_sort bitmapper: an efficient all-mapper based on bit-vector computing
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4462005/
https://www.ncbi.nlm.nih.gov/pubmed/26063651
http://dx.doi.org/10.1186/s12859-015-0626-9
work_keys_str_mv AT chenghaoyu bitmapperanefficientallmapperbasedonbitvectorcomputing
AT jianghuaipan bitmapperanefficientallmapperbasedonbitvectorcomputing
AT yangjiaoyun bitmapperanefficientallmapperbasedonbitvectorcomputing
AT xuyun bitmapperanefficientallmapperbasedonbitvectorcomputing
AT shangyi bitmapperanefficientallmapperbasedonbitvectorcomputing