Automated alignment-based curation of gene models in filamentous fungi
BACKGROUND: Automated gene-calling is still an error-prone process, particularly for the highly plastic genomes of fungal species. Improvement through quality control and manual curation of gene models is a time-consuming process that requires skilled biologists and is only marginally performed. The...
Main authors: | van der Burgt, Ate; Severing, Edouard; Collemare, Jérôme; de Wit, Pierre JGM |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3898260/ https://www.ncbi.nlm.nih.gov/pubmed/24433567 http://dx.doi.org/10.1186/1471-2105-15-19 |
_version_ | 1782300393627189248 |
---|---|
author | van der Burgt, Ate; Severing, Edouard; Collemare, Jérôme; de Wit, Pierre JGM |
author_sort | van der Burgt, Ate |
collection | PubMed |
description | BACKGROUND: Automated gene-calling is still an error-prone process, particularly for the highly plastic genomes of fungal species. Improvement through quality control and manual curation of gene models is a time-consuming process that requires skilled biologists and is only marginally performed. The wealth of available fungal genomes has not yet been exploited by an automated method that applies quality control of gene models in order to obtain more accurate genome annotations. RESULTS: We provide a novel method named alignment-based fungal gene prediction (ABFGP) that is particularly suitable for plastic genomes like those of fungi. It can assess gene models on a gene-by-gene basis, making use of informant gene loci. Its performance was benchmarked on 6,965 gene models confirmed by full-length unigenes from ten different fungi. 79.4% of all gene models were correctly predicted by ABFGP. It improves the output of ab initio gene prediction software due to a higher sensitivity and precision for all gene model components. Applicability of the method was shown by revisiting the annotations of six different fungi, using gene loci from up to 29 fungal genomes as informants. Between 7,231 and 8,337 genes were assessed by ABFGP and for each genome between 1,724 and 3,505 gene model revisions were proposed. The reliability of the proposed gene models is assessed by an a posteriori introspection procedure of each intron and exon in the multiple gene model alignment. The total number and type of proposed gene model revisions in the six fungal genomes are correlated with the quality of the genome assembly and with the sequencing strategies used in the sequencing centre, highlighting different types of errors in different annotation pipelines. The ABFGP method is particularly successful in discovering sequence errors and/or disruptive mutations causing truncated and erroneous gene models. CONCLUSIONS: The ABFGP method is an accurate and fully automated quality control method for fungal gene catalogues that can be easily implemented into existing annotation pipelines. With the exponential release of new genomes, the ABFGP method will help decrease the number of gene models that require additional manual curation. |
format | Online Article Text |
id | pubmed-3898260 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3898260 2014-01-23 Automated alignment-based curation of gene models in filamentous fungi van der Burgt, Ate; Severing, Edouard; Collemare, Jérôme; de Wit, Pierre JGM BMC Bioinformatics Methodology Article BioMed Central 2014-01-16 /pmc/articles/PMC3898260/ /pubmed/24433567 http://dx.doi.org/10.1186/1471-2105-15-19 Text en Copyright © 2014 van der Burgt et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Automated alignment-based curation of gene models in filamentous fungi |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3898260/ https://www.ncbi.nlm.nih.gov/pubmed/24433567 http://dx.doi.org/10.1186/1471-2105-15-19 |