Galba: genome annotation with miniprot and AUGUSTUS
BACKGROUND: The Earth Biogenome Project has rapidly increased the number of available eukaryotic genomes, but most released genomes continue to lack annotation of protein-coding genes. In addition, no transcriptome data is available for some genomes. RESULTS: Various gene annotation tools have been...
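Main authors: | Brůna, Tomáš; Li, Heng; Guhlin, Joseph; Honsel, Daniel; Herbold, Steffen; Stanke, Mario; Nenasheva, Natalia; Ebel, Matthis; Gabriel, Lars; Hoff, Katharina J. |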
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10472564/ https://www.ncbi.nlm.nih.gov/pubmed/37653395 http://dx.doi.org/10.1186/s12859-023-05449-z |
_version_ | 1785100104068235264 |
---|---|
author | Brůna, Tomáš; Li, Heng; Guhlin, Joseph; Honsel, Daniel; Herbold, Steffen; Stanke, Mario; Nenasheva, Natalia; Ebel, Matthis; Gabriel, Lars; Hoff, Katharina J. |
author_facet | Brůna, Tomáš; Li, Heng; Guhlin, Joseph; Honsel, Daniel; Herbold, Steffen; Stanke, Mario; Nenasheva, Natalia; Ebel, Matthis; Gabriel, Lars; Hoff, Katharina J. |
author_sort | Brůna, Tomáš |
collection | PubMed |
description | BACKGROUND: The Earth Biogenome Project has rapidly increased the number of available eukaryotic genomes, but most released genomes continue to lack annotation of protein-coding genes. In addition, no transcriptome data is available for some genomes. RESULTS: Various gene annotation tools have been developed but each has its limitations. Here, we introduce GALBA, a fully automated pipeline that utilizes miniprot, a rapid protein-to-genome aligner, in combination with AUGUSTUS to predict genes with high accuracy. Accuracy results indicate that GALBA is particularly strong in the annotation of large vertebrate genomes. We also present use cases in insects, vertebrates, and a land plant. GALBA is fully open source and available as a docker image for easy execution with Singularity in high-performance computing environments. CONCLUSIONS: Our pipeline addresses the critical need for accurate gene annotation in newly sequenced genomes, and we believe that GALBA will greatly facilitate genome annotation for diverse organisms. |
format | Online Article Text |
id | pubmed-10472564 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-10472564 2023-09-02 Galba: genome annotation with miniprot and AUGUSTUS Brůna, Tomáš; Li, Heng; Guhlin, Joseph; Honsel, Daniel; Herbold, Steffen; Stanke, Mario; Nenasheva, Natalia; Ebel, Matthis; Gabriel, Lars; Hoff, Katharina J. BMC Bioinformatics Research BACKGROUND: The Earth Biogenome Project has rapidly increased the number of available eukaryotic genomes, but most released genomes continue to lack annotation of protein-coding genes. In addition, no transcriptome data is available for some genomes. RESULTS: Various gene annotation tools have been developed but each has its limitations. Here, we introduce GALBA, a fully automated pipeline that utilizes miniprot, a rapid protein-to-genome aligner, in combination with AUGUSTUS to predict genes with high accuracy. Accuracy results indicate that GALBA is particularly strong in the annotation of large vertebrate genomes. We also present use cases in insects, vertebrates, and a land plant. GALBA is fully open source and available as a docker image for easy execution with Singularity in high-performance computing environments. CONCLUSIONS: Our pipeline addresses the critical need for accurate gene annotation in newly sequenced genomes, and we believe that GALBA will greatly facilitate genome annotation for diverse organisms. BioMed Central 2023-08-31 /pmc/articles/PMC10472564/ /pubmed/37653395 http://dx.doi.org/10.1186/s12859-023-05449-z Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Brůna, Tomáš Li, Heng Guhlin, Joseph Honsel, Daniel Herbold, Steffen Stanke, Mario Nenasheva, Natalia Ebel, Matthis Gabriel, Lars Hoff, Katharina J. Galba: genome annotation with miniprot and AUGUSTUS |
title | Galba: genome annotation with miniprot and AUGUSTUS |
title_full | Galba: genome annotation with miniprot and AUGUSTUS |
title_fullStr | Galba: genome annotation with miniprot and AUGUSTUS |
title_full_unstemmed | Galba: genome annotation with miniprot and AUGUSTUS |
title_short | Galba: genome annotation with miniprot and AUGUSTUS |
title_sort | galba: genome annotation with miniprot and augustus |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10472564/ https://www.ncbi.nlm.nih.gov/pubmed/37653395 http://dx.doi.org/10.1186/s12859-023-05449-z |
work_keys_str_mv | AT brunatomas galbagenomeannotationwithminiprotandaugustus AT liheng galbagenomeannotationwithminiprotandaugustus AT guhlinjoseph galbagenomeannotationwithminiprotandaugustus AT honseldaniel galbagenomeannotationwithminiprotandaugustus AT herboldsteffen galbagenomeannotationwithminiprotandaugustus AT stankemario galbagenomeannotationwithminiprotandaugustus AT nenashevanatalia galbagenomeannotationwithminiprotandaugustus AT ebelmatthis galbagenomeannotationwithminiprotandaugustus AT gabriellars galbagenomeannotationwithminiprotandaugustus AT hoffkatharinaj galbagenomeannotationwithminiprotandaugustus |