GenMap: ultra-fast computation of genome mappability

MOTIVATION: Computing the uniqueness of k-mers for each position of a genome while allowing for up to e mismatches is computationally challenging. However, it is crucial for many biological applications such as the design of guide RNA for CRISPR experiments. More formally, the uniqueness or (k, e)-m...

Bibliographic Details
Main authors: Pockrandt, Christopher; Alzamel, Mai; Iliopoulos, Costas S; Reinert, Knut
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7320602/
https://www.ncbi.nlm.nih.gov/pubmed/32246826
http://dx.doi.org/10.1093/bioinformatics/btaa222
author Pockrandt, Christopher
Alzamel, Mai
Iliopoulos, Costas S
Reinert, Knut
collection PubMed
description MOTIVATION: Computing the uniqueness of k-mers for each position of a genome while allowing for up to e mismatches is computationally challenging. However, it is crucial for many biological applications such as the design of guide RNA for CRISPR experiments. More formally, the uniqueness or (k, e)-mappability can be described for every position as the reciprocal value of how often this k-mer occurs approximately in the genome, i.e. with up to e mismatches. RESULTS: We present a fast method GenMap to compute the (k, e)-mappability. We extend the mappability algorithm, such that it can also be computed across multiple genomes where a k-mer occurrence is only counted once per genome. This allows for the computation of marker sequences or finding candidates for probe design by identifying approximate k-mers that are unique to a genome or that are present in all genomes. GenMap supports different formats such as binary output, wig and bed files as well as csv files to export the location of all approximate k-mers for each genomic position. AVAILABILITY AND IMPLEMENTATION: GenMap can be installed via bioconda. Binaries and C++ source code are available on https://github.com/cpockrandt/genmap.
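The definition in the abstract (mappability at a position is the reciprocal of the number of approximate occurrences of the k-mer starting there) can be illustrated with a brute-force sketch. This is not GenMap's algorithm, which the record does not describe. It is a quadratic-time reference that assumes Hamming distance for "up to e mismatches", considers the forward strand only, and counts the k-mer's own occurrence. The function names `hamming_within`, `naive_mappability` and `genomes_containing` are hypothetical and chosen for illustration.

```python
from typing import List


def hamming_within(a: str, b: str, e: int) -> bool:
    """Return True if equal-length strings a and b differ in at most e positions."""
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > e:
                return False
    return True


def naive_mappability(genome: str, k: int, e: int) -> List[float]:
    """Brute-force (k, e)-mappability for every k-mer start position.

    The value at position i is 1 / c, where c is the number of positions j
    whose k-mer matches the k-mer at i with up to e mismatches (i itself
    included, so c >= 1). Assumption: forward strand only.
    """
    n = len(genome) - k + 1
    if n <= 0:
        return []
    kmers = [genome[i:i + k] for i in range(n)]
    result = []
    for i in range(n):
        count = sum(1 for j in range(n) if hamming_within(kmers[i], kmers[j], e))
        result.append(1.0 / count)
    return result


def genomes_containing(kmer: str, genomes: List[str], e: int) -> int:
    """Count genomes with at least one approximate occurrence of kmer.

    Each genome contributes at most 1, mirroring the abstract's statement
    that an occurrence is only counted once per genome.
    """
    k = len(kmer)
    hits = 0
    for g in genomes:
        if any(hamming_within(kmer, g[i:i + k], e) for i in range(len(g) - k + 1)):
            hits += 1
    return hits
```

Under these assumptions, a value of 1.0 marks a k-mer that is unique within the genome, and a k-mer for which `genomes_containing` returns 1 is a candidate marker for a single genome, while one that returns `len(genomes)` is present in all of them.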
format Online
Article
Text
id pubmed-7320602
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-73206022020-07-01 GenMap: ultra-fast computation of genome mappability Pockrandt, Christopher Alzamel, Mai Iliopoulos, Costas S Reinert, Knut Bioinformatics Original Papers MOTIVATION: Computing the uniqueness of k-mers for each position of a genome while allowing for up to e mismatches is computationally challenging. However, it is crucial for many biological applications such as the design of guide RNA for CRISPR experiments. More formally, the uniqueness or (k, e)-mappability can be described for every position as the reciprocal value of how often this k-mer occurs approximately in the genome, i.e. with up to e mismatches. RESULTS: We present a fast method GenMap to compute the (k, e)-mappability. We extend the mappability algorithm, such that it can also be computed across multiple genomes where a k-mer occurrence is only counted once per genome. This allows for the computation of marker sequences or finding candidates for probe design by identifying approximate k-mers that are unique to a genome or that are present in all genomes. GenMap supports different formats such as binary output, wig and bed files as well as csv files to export the location of all approximate k-mers for each genomic position. AVAILABILITY AND IMPLEMENTATION: GenMap can be installed via bioconda. Binaries and C++ source code are available on https://github.com/cpockrandt/genmap. Oxford University Press 2020-06-15 2020-04-04 /pmc/articles/PMC7320602/ /pubmed/32246826 http://dx.doi.org/10.1093/bioinformatics/btaa222 Text en © The Author(s) 2020. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title GenMap: ultra-fast computation of genome mappability
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7320602/
https://www.ncbi.nlm.nih.gov/pubmed/32246826
http://dx.doi.org/10.1093/bioinformatics/btaa222