ntHash: recursive nucleotide hashing
Motivation: Hashing has been widely used for indexing, querying and rapid similarity search in many bioinformatics applications, including sequence alignment, genome and transcriptome assembly, k-mer counting and error correction. Hence, expediting hashing operations would have a substantial impact...
Main Authors: | Mohamadi, Hamid, Chu, Justin, Vandervalk, Benjamin P., Birol, Inanc |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2016 |
Subjects: | Applications Notes |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5181554/ https://www.ncbi.nlm.nih.gov/pubmed/27423894 http://dx.doi.org/10.1093/bioinformatics/btw397 |
_version_ | 1782485730268807168 |
---|---|
author | Mohamadi, Hamid Chu, Justin Vandervalk, Benjamin P. Birol, Inanc |
author_facet | Mohamadi, Hamid Chu, Justin Vandervalk, Benjamin P. Birol, Inanc |
author_sort | Mohamadi, Hamid |
collection | PubMed |
description | Motivation: Hashing has been widely used for indexing, querying and rapid similarity search in many bioinformatics applications, including sequence alignment, genome and transcriptome assembly, k-mer counting and error correction. Hence, expediting hashing operations would have a substantial impact in the field, making bioinformatics applications faster and more efficient. Results: We present ntHash, a hashing algorithm tuned for processing DNA/RNA sequences. It performs the best when calculating hash values for adjacent k-mers in an input sequence, operating an order of magnitude faster than the best performing alternatives in typical use cases. Availability and implementation: ntHash is available online at http://www.bcgsc.ca/platform/bioinfo/software/nthash and is free for academic use. Contacts: hmohamadi@bcgsc.ca or ibirol@bcgsc.ca Supplementary information: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-5181554 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-51815542016-12-27 ntHash: recursive nucleotide hashing Mohamadi, Hamid Chu, Justin Vandervalk, Benjamin P. Birol, Inanc Bioinformatics Applications Notes Motivation: Hashing has been widely used for indexing, querying and rapid similarity search in many bioinformatics applications, including sequence alignment, genome and transcriptome assembly, k-mer counting and error correction. Hence, expediting hashing operations would have a substantial impact in the field, making bioinformatics applications faster and more efficient. Results: We present ntHash, a hashing algorithm tuned for processing DNA/RNA sequences. It performs the best when calculating hash values for adjacent k-mers in an input sequence, operating an order of magnitude faster than the best performing alternatives in typical use cases. Availability and implementation: ntHash is available online at http://www.bcgsc.ca/platform/bioinfo/software/nthash and is free for academic use. Contacts: hmohamadi@bcgsc.ca or ibirol@bcgsc.ca Supplementary information: Supplementary data are available at Bioinformatics online. Oxford University Press 2016-11-15 2016-07-16 /pmc/articles/PMC5181554/ /pubmed/27423894 http://dx.doi.org/10.1093/bioinformatics/btw397 Text en © The Author 2016. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Applications Notes Mohamadi, Hamid Chu, Justin Vandervalk, Benjamin P. Birol, Inanc ntHash: recursive nucleotide hashing |
title | ntHash: recursive nucleotide hashing |
title_full | ntHash: recursive nucleotide hashing |
title_fullStr | ntHash: recursive nucleotide hashing |
title_full_unstemmed | ntHash: recursive nucleotide hashing |
title_short | ntHash: recursive nucleotide hashing |
title_sort | nthash: recursive nucleotide hashing |
topic | Applications Notes |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5181554/ https://www.ncbi.nlm.nih.gov/pubmed/27423894 http://dx.doi.org/10.1093/bioinformatics/btw397 |
work_keys_str_mv | AT mohamadihamid nthashrecursivenucleotidehashing AT chujustin nthashrecursivenucleotidehashing AT vandervalkbenjaminp nthashrecursivenucleotidehashing AT birolinanc nthashrecursivenucleotidehashing |
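
The description above says the method performs best when computing hash values for adjacent k-mers in an input sequence. As a rough illustration of why a recursive scheme helps, the following is a minimal sketch of a generic cyclic-polynomial rolling hash over nucleotides. It is not taken from this record or from the ntHash source code: the names `SEED`, `rol`, `hash_kmer` and `rolling_hashes` are hypothetical, and the seed constants are arbitrary placeholders rather than the values ntHash uses.

```python
MASK64 = (1 << 64) - 1

# Arbitrary 64-bit seed per base. Placeholder values chosen for
# illustration only; they are NOT the constants used by ntHash.
SEED = {
    "A": 0x9E3779B97F4A7C15,
    "C": 0xBF58476D1CE4E5B9,
    "G": 0x94D049BB133111EB,
    "T": 0xD6E8FEB86659FD93,
}


def rol(x, r):
    """Rotate the 64-bit value x left by r bits (r is taken modulo 64)."""
    r %= 64
    return ((x << r) | (x >> (64 - r))) & MASK64


def hash_kmer(kmer):
    """Hash one k-mer from scratch in O(k): rotate, then XOR in each base."""
    h = 0
    for base in kmer:
        h = rol(h, 1) ^ SEED[base]
    return h


def rolling_hashes(seq, k):
    """Yield the hash of every k-mer in seq, left to right.

    Only the first k-mer is hashed from scratch. Each later hash is
    derived from the previous one in O(1) by rotating it, removing the
    base that left the window, and adding the base that entered it.
    """
    if k <= 0 or len(seq) < k:
        return
    h = hash_kmer(seq[:k])
    yield h
    for i in range(1, len(seq) - k + 1):
        outgoing = SEED[seq[i - 1]]
        incoming = SEED[seq[i + k - 1]]
        h = rol(h, 1) ^ rol(outgoing, k) ^ incoming
        yield h


if __name__ == "__main__":
    for kmer_hash in rolling_hashes("ACGTACGTTGCA", 4):
        print(f"{kmer_hash:016x}")
```

In this sketch each sliding step costs one rotation of the running hash, one rotation of a seed and two XORs, regardless of k, which is the kind of saving a recursive formulation offers over rehashing every k-mer from scratch.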