Accelerating read mapping with FastHASH

With the introduction of next-generation sequencing (NGS) technologies, we are facing an exponential increase in the amount of genomic sequence data. The success of all medical and genetic applications of next-generation sequencing critically depends on the existence of computational techniques that...

Bibliographic Details
Main Authors:	Xin, Hongyi, Lee, Donghyuk, Hormozdiari, Farhad, Yedkar, Samihan, Mutlu, Onur, Alkan, Can
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2013
Subjects:	Proceedings
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3549798/ https://www.ncbi.nlm.nih.gov/pubmed/23369189 http://dx.doi.org/10.1186/1471-2164-14-S1-S13

author	Xin, Hongyi Lee, Donghyuk Hormozdiari, Farhad Yedkar, Samihan Mutlu, Onur Alkan, Can
author_facet	Xin, Hongyi Lee, Donghyuk Hormozdiari, Farhad Yedkar, Samihan Mutlu, Onur Alkan, Can
author_sort	Xin, Hongyi
collection	PubMed
description	With the introduction of next-generation sequencing (NGS) technologies, we are facing an exponential increase in the amount of genomic sequence data. The success of all medical and genetic applications of next-generation sequencing critically depends on the existence of computational techniques that can process and analyze the enormous amount of sequence data quickly and accurately. Unfortunately, the current read mapping algorithms have difficulties in coping with the massive amounts of data generated by NGS. We propose a new algorithm, FastHASH, which drastically improves the performance of the seed-and-extend type hash table based read mapping algorithms, while maintaining the high sensitivity and comprehensiveness of such methods. FastHASH is a generic algorithm compatible with all seed-and-extend class read mapping algorithms. It introduces two main techniques, namely Adjacency Filtering, and Cheap K-mer Selection. We implemented FastHASH and merged it into the codebase of the popular read mapping program, mrFAST. Depending on the edit distance cutoffs, we observed up to 19-fold speedup while still maintaining 100% sensitivity and high comprehensiveness.
format	Online Article Text
id	pubmed-3549798
institution	National Center for Biotechnology Information
language	English
publishDate	2013
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3549798 2013-01-23 Accelerating read mapping with FastHASH Xin, Hongyi Lee, Donghyuk Hormozdiari, Farhad Yedkar, Samihan Mutlu, Onur Alkan, Can BMC Genomics Proceedings With the introduction of next-generation sequencing (NGS) technologies, we are facing an exponential increase in the amount of genomic sequence data. The success of all medical and genetic applications of next-generation sequencing critically depends on the existence of computational techniques that can process and analyze the enormous amount of sequence data quickly and accurately. Unfortunately, the current read mapping algorithms have difficulties in coping with the massive amounts of data generated by NGS. We propose a new algorithm, FastHASH, which drastically improves the performance of the seed-and-extend type hash table based read mapping algorithms, while maintaining the high sensitivity and comprehensiveness of such methods. FastHASH is a generic algorithm compatible with all seed-and-extend class read mapping algorithms. It introduces two main techniques, namely Adjacency Filtering, and Cheap K-mer Selection. We implemented FastHASH and merged it into the codebase of the popular read mapping program, mrFAST. Depending on the edit distance cutoffs, we observed up to 19-fold speedup while still maintaining 100% sensitivity and high comprehensiveness. BioMed Central 2013-01-21 /pmc/articles/PMC3549798/ /pubmed/23369189 http://dx.doi.org/10.1186/1471-2164-14-S1-S13 Text en Copyright ©2013 Xin et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Proceedings Xin, Hongyi Lee, Donghyuk Hormozdiari, Farhad Yedkar, Samihan Mutlu, Onur Alkan, Can Accelerating read mapping with FastHASH
title	Accelerating read mapping with FastHASH
title_full	Accelerating read mapping with FastHASH
title_fullStr	Accelerating read mapping with FastHASH
title_full_unstemmed	Accelerating read mapping with FastHASH
title_short	Accelerating read mapping with FastHASH
title_sort	accelerating read mapping with fasthash
topic	Proceedings
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3549798/ https://www.ncbi.nlm.nih.gov/pubmed/23369189 http://dx.doi.org/10.1186/1471-2164-14-S1-S13
work_keys_str_mv	AT xinhongyi acceleratingreadmappingwithfasthash AT leedonghyuk acceleratingreadmappingwithfasthash AT hormozdiarifarhad acceleratingreadmappingwithfasthash AT yedkarsamihan acceleratingreadmappingwithfasthash AT mutluonur acceleratingreadmappingwithfasthash AT alkancan acceleratingreadmappingwithfasthash

Similar Items