Protein-to-genome alignment with miniprot

MOTIVATION: Protein-to-genome alignment is critical to annotating genes in non-model organisms. While there are a few tools for this purpose, all of them were developed over 10 years ago and did not incorporate the latest advances in alignment algorithms. They are inefficient and could not keep up w...

Bibliographic Details
Main author: Li, Heng
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9869432/
https://www.ncbi.nlm.nih.gov/pubmed/36648328
http://dx.doi.org/10.1093/bioinformatics/btad014
author Li, Heng
author_facet Li, Heng
author_sort Li, Heng
collection PubMed
description MOTIVATION: Protein-to-genome alignment is critical to annotating genes in non-model organisms. While there are a few tools for this purpose, all of them were developed over 10 years ago and did not incorporate the latest advances in alignment algorithms. They are inefficient and could not keep up with the rapid production of new genomes and quickly growing protein databases. RESULTS: Here, we describe miniprot, a new aligner for mapping protein sequences to a complete genome. Miniprot integrates recent techniques such as k-mer sketch and vectorized dynamic programming. It is tens of times faster than existing tools while achieving comparable accuracy on real data. AVAILABILITY AND IMPLEMENTATION: https://github.com/lh3/miniprot.
format Online
Article
Text
id pubmed-9869432
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9869432 2023-01-23 Protein-to-genome alignment with miniprot Li, Heng Bioinformatics Original Paper MOTIVATION: Protein-to-genome alignment is critical to annotating genes in non-model organisms. While there are a few tools for this purpose, all of them were developed over 10 years ago and did not incorporate the latest advances in alignment algorithms. They are inefficient and could not keep up with the rapid production of new genomes and quickly growing protein databases. RESULTS: Here, we describe miniprot, a new aligner for mapping protein sequences to a complete genome. Miniprot integrates recent techniques such as k-mer sketch and vectorized dynamic programming. It is tens of times faster than existing tools while achieving comparable accuracy on real data. AVAILABILITY AND IMPLEMENTATION: https://github.com/lh3/miniprot. Oxford University Press 2023-01-17 /pmc/articles/PMC9869432/ /pubmed/36648328 http://dx.doi.org/10.1093/bioinformatics/btad014 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Original Paper
Li, Heng
Protein-to-genome alignment with miniprot
title Protein-to-genome alignment with miniprot
title_full Protein-to-genome alignment with miniprot
title_fullStr Protein-to-genome alignment with miniprot
title_full_unstemmed Protein-to-genome alignment with miniprot
title_short Protein-to-genome alignment with miniprot
title_sort protein-to-genome alignment with miniprot
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9869432/
https://www.ncbi.nlm.nih.gov/pubmed/36648328
http://dx.doi.org/10.1093/bioinformatics/btad014
work_keys_str_mv AT liheng proteintogenomealignmentwithminiprot