The K-mer File Format: a standardized and compact disk representation of sets of k-mers
SUMMARY: Bioinformatics applications increasingly rely on ad hoc disk storage of k-mer sets, e.g. for de Bruijn graphs or alignment indexes. Here, we introduce the K-mer File Format as a general lossless framework for storing and manipulating k-mer sets, realizing space savings of 3–5× compared to o...
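Main authors: | Dufresne, Yoann; Lemane, Teo; Marijon, Pierre; Peterlongo, Pierre; Rahman, Amatur; Kokot, Marek; Medvedev, Paul; Deorowicz, Sebastian; Chikhi, Rayan |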
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2022 |
Subjects: | Applications Note |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9477520/ https://www.ncbi.nlm.nih.gov/pubmed/35904548 http://dx.doi.org/10.1093/bioinformatics/btac528 |
_version_ | 1784790379291213824 |
---|---|
author | Dufresne, Yoann; Lemane, Teo; Marijon, Pierre; Peterlongo, Pierre; Rahman, Amatur; Kokot, Marek; Medvedev, Paul; Deorowicz, Sebastian; Chikhi, Rayan |
author_sort | Dufresne, Yoann |
collection | PubMed |
description | SUMMARY: Bioinformatics applications increasingly rely on ad hoc disk storage of k-mer sets, e.g. for de Bruijn graphs or alignment indexes. Here, we introduce the K-mer File Format as a general lossless framework for storing and manipulating k-mer sets, realizing space savings of 3–5× compared to other formats, and bringing interoperability across tools. AVAILABILITY AND IMPLEMENTATION: Format specification, C++/Rust API, tools: https://github.com/Kmer-File-Format/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-9477520 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9477520 2022-09-19 The K-mer File Format: a standardized and compact disk representation of sets of k-mers Dufresne, Yoann Lemane, Teo Marijon, Pierre Peterlongo, Pierre Rahman, Amatur Kokot, Marek Medvedev, Paul Deorowicz, Sebastian Chikhi, Rayan Bioinformatics Applications Note SUMMARY: Bioinformatics applications increasingly rely on ad hoc disk storage of k-mer sets, e.g. for de Bruijn graphs or alignment indexes. Here, we introduce the K-mer File Format as a general lossless framework for storing and manipulating k-mer sets, realizing space savings of 3–5× compared to other formats, and bringing interoperability across tools. AVAILABILITY AND IMPLEMENTATION: Format specification, C++/Rust API, tools: https://github.com/Kmer-File-Format/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2022-07-29 /pmc/articles/PMC9477520/ /pubmed/35904548 http://dx.doi.org/10.1093/bioinformatics/btac528 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | The K-mer File Format: a standardized and compact disk representation of sets of k-mers |
topic | Applications Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9477520/ https://www.ncbi.nlm.nih.gov/pubmed/35904548 http://dx.doi.org/10.1093/bioinformatics/btac528 |