BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database
The task of eukaryotic genome annotation remains challenging. Only a few genomes could serve as standards of annotation achieved through a tremendous investment of human curation efforts. Still, the correctness of all alternative isoforms, even in the best-annotated genomes, could be a good subject...
Main authors: | Brůna, Tomáš; Hoff, Katharina J; Lomsadze, Alexandre; Stanke, Mario; Borodovsky, Mark |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2021 |
Subjects: | Standard Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7787252/ https://www.ncbi.nlm.nih.gov/pubmed/33575650 http://dx.doi.org/10.1093/nargab/lqaa108 |
_version_ | 1783632789121269760 |
---|---|
author | Brůna, Tomáš Hoff, Katharina J Lomsadze, Alexandre Stanke, Mario Borodovsky, Mark |
author_facet | Brůna, Tomáš Hoff, Katharina J Lomsadze, Alexandre Stanke, Mario Borodovsky, Mark |
author_sort | Brůna, Tomáš |
collection | PubMed |
description | The task of eukaryotic genome annotation remains challenging. Only a few genomes could serve as standards of annotation achieved through a tremendous investment of human curation efforts. Still, the correctness of all alternative isoforms, even in the best-annotated genomes, could be a good subject for further investigation. The new BRAKER2 pipeline generates and integrates external protein support into the iterative process of training and gene prediction by GeneMark-EP+ and AUGUSTUS. BRAKER2 continues the line started by BRAKER1 where self-training GeneMark-ET and AUGUSTUS made gene predictions supported by transcriptomic data. Among the challenges addressed by the new pipeline was a generation of reliable hints to protein-coding exon boundaries from likely homologous but evolutionarily distant proteins. In comparison with other pipelines for eukaryotic genome annotation, BRAKER2 is fully automatic. It is favorably compared under equal conditions with other pipelines, e.g. MAKER2, in terms of accuracy and performance. Development of BRAKER2 should facilitate solving the task of harmonization of annotation of protein-coding genes in genomes of different eukaryotic species. However, we fully understand that several more innovations are needed in transcriptomic and proteomic technologies as well as in algorithmic development to reach the goal of highly accurate annotation of eukaryotic genomes. |
format | Online Article Text |
id | pubmed-7787252 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-77872522021-02-10 BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database Brůna, Tomáš Hoff, Katharina J Lomsadze, Alexandre Stanke, Mario Borodovsky, Mark NAR Genom Bioinform Standard Article The task of eukaryotic genome annotation remains challenging. Only a few genomes could serve as standards of annotation achieved through a tremendous investment of human curation efforts. Still, the correctness of all alternative isoforms, even in the best-annotated genomes, could be a good subject for further investigation. The new BRAKER2 pipeline generates and integrates external protein support into the iterative process of training and gene prediction by GeneMark-EP+ and AUGUSTUS. BRAKER2 continues the line started by BRAKER1 where self-training GeneMark-ET and AUGUSTUS made gene predictions supported by transcriptomic data. Among the challenges addressed by the new pipeline was a generation of reliable hints to protein-coding exon boundaries from likely homologous but evolutionarily distant proteins. In comparison with other pipelines for eukaryotic genome annotation, BRAKER2 is fully automatic. It is favorably compared under equal conditions with other pipelines, e.g. MAKER2, in terms of accuracy and performance. Development of BRAKER2 should facilitate solving the task of harmonization of annotation of protein-coding genes in genomes of different eukaryotic species. However, we fully understand that several more innovations are needed in transcriptomic and proteomic technologies as well as in algorithmic development to reach the goal of highly accurate annotation of eukaryotic genomes. Oxford University Press 2021-01-06 /pmc/articles/PMC7787252/ /pubmed/33575650 http://dx.doi.org/10.1093/nargab/lqaa108 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. 
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Standard Article Brůna, Tomáš Hoff, Katharina J Lomsadze, Alexandre Stanke, Mario Borodovsky, Mark BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database |
title | BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database |
topic | Standard Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7787252/ https://www.ncbi.nlm.nih.gov/pubmed/33575650 http://dx.doi.org/10.1093/nargab/lqaa108 |