Fast and accurate read mapping with approximate seeds and multiple backtracking

We present Masai, a read mapper representing the state-of-the-art in terms of speed and accuracy. Our tool is an order of magnitude faster than RazerS 3 and mrFAST, 2–4 times faster and more accurate than Bowtie 2 and BWA. The novelties of our read mapper are filtration with approximate seeds and a...

Bibliographic Details
Main authors: Siragusa, Enrico; Weese, David; Reinert, Knut
Format: Online Article Text
Language: English
Published: Oxford University Press 2013
Subjects: Methods Online
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3627565/
https://www.ncbi.nlm.nih.gov/pubmed/23358824
http://dx.doi.org/10.1093/nar/gkt005
author Siragusa, Enrico
Weese, David
Reinert, Knut
collection PubMed
description We present Masai, a read mapper that represents the state of the art in speed and accuracy. Our tool is an order of magnitude faster than RazerS 3 and mrFAST, and 2–4 times faster and more accurate than Bowtie 2 and BWA. The novelties of our read mapper are filtration with approximate seeds and a method for multiple backtracking. Approximate seeds, compared with exact seeds, increase filtration specificity while preserving sensitivity. Multiple backtracking amortizes the cost of searching a large set of seeds by taking advantage of the repetitiveness of next-generation sequencing data. Combined, these two methods significantly speed up approximate search on genomic data sets. Masai is implemented in C++ using the SeqAn library. The source code is distributed under the BSD license, and binaries for Linux, Mac OS X and Windows can be freely downloaded from http://www.seqan.de/projects/masai.
format Online
Article
Text
id pubmed-3627565
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3627565 2013-04-17
Fast and accurate read mapping with approximate seeds and multiple backtracking
Siragusa, Enrico; Weese, David; Reinert, Knut
Nucleic Acids Res; Methods Online
Oxford University Press 2013-04 2013-01-28
/pmc/articles/PMC3627565/ /pubmed/23358824 http://dx.doi.org/10.1093/nar/gkt005
Text en
© The Author(s) 2013. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Fast and accurate read mapping with approximate seeds and multiple backtracking
topic Methods Online
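
The abstract above names filtration with approximate seeds but gives no details, so the following is only a minimal sketch of the general idea, not Masai's implementation. It assumes Hamming distance (mismatches only), non-overlapping seeds, and a naive scan of the reference in place of the index-based multiple backtracking the paper describes. It relies on the standard pigeonhole argument: if a read matches with at most k mismatches and is cut into s seeds, at least one seed matches with at most floor(k / s) mismatches. The names `hamming`, `split_into_seeds` and `map_read` are hypothetical and chosen for illustration.

```python
def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def split_into_seeds(read, num_seeds):
    """Cut the read into num_seeds non-overlapping pieces.

    Returns a list of (offset_in_read, seed) pairs that cover the whole read.
    """
    base, extra = divmod(len(read), num_seeds)
    seeds, start = [], 0
    for i in range(num_seeds):
        length = base + (1 if i < extra else 0)
        seeds.append((start, read[start:start + length]))
        start += length
    return seeds


def map_read(reference, read, max_errors, num_seeds):
    """Return all start positions where read matches with <= max_errors mismatches.

    Filtration: each seed is searched with up to max_errors // num_seeds
    mismatches (an approximate seed). Every seed hit yields one candidate
    start position, which is then verified against the full read.
    """
    seed_errors = max_errors // num_seeds
    candidates = set()
    for offset, seed in split_into_seeds(read, num_seeds):
        # Naive scan; a real mapper would search an index here (assumption).
        for pos in range(len(reference) - len(seed) + 1):
            if hamming(reference[pos:pos + len(seed)], seed) <= seed_errors:
                start = pos - offset
                if 0 <= start <= len(reference) - len(read):
                    candidates.add(start)
    # Verification step: keep only candidates within the full error budget.
    return sorted(
        s for s in candidates
        if hamming(reference[s:s + len(read)], read) <= max_errors
    )


if __name__ == "__main__":
    reference = "ACGTACGTTTGACCA"
    print(map_read(reference, "ACGTTTGA", max_errors=0, num_seeds=2))
```

With exact seeds the same argument requires k + 1 seeds, which become short and match in many places; allowing a few mismatches per seed permits fewer, longer seeds, which is one way to read the abstract's claim that approximate seeds raise specificity without losing sensitivity.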