Automated alignment-based curation of gene models in filamentous fungi

BACKGROUND: Automated gene-calling is still an error-prone process, particularly for the highly plastic genomes of fungal species. Improvement through quality control and manual curation of gene models is a time-consuming process that requires skilled biologists and is only marginally performed. The...

Bibliographic Details
Main authors: van der Burgt, Ate; Severing, Edouard; Collemare, Jérôme; de Wit, Pierre JGM
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3898260/
https://www.ncbi.nlm.nih.gov/pubmed/24433567
http://dx.doi.org/10.1186/1471-2105-15-19
author van der Burgt, Ate
Severing, Edouard
Collemare, Jérôme
de Wit, Pierre JGM
collection PubMed
description BACKGROUND: Automated gene-calling is still an error-prone process, particularly for the highly plastic genomes of fungal species. Improvement through quality control and manual curation of gene models is a time-consuming process that requires skilled biologists and is only marginally performed. The wealth of available fungal genomes has not yet been exploited by an automated method that applies quality control of gene models in order to obtain more accurate genome annotations. RESULTS: We provide a novel method named alignment-based fungal gene prediction (ABFGP) that is particularly suitable for plastic genomes like those of fungi. It can assess gene models on a gene-by-gene basis making use of informant gene loci. Its performance was benchmarked on 6,965 gene models confirmed by full-length unigenes from ten different fungi. 79.4% of all gene models were correctly predicted by ABFGP. It improves the output of ab initio gene prediction software due to a higher sensitivity and precision for all gene model components. Applicability of the method was shown by revisiting the annotations of six different fungi, using gene loci from up to 29 fungal genomes as informants. Between 7,231 and 8,337 genes were assessed by ABFGP and for each genome between 1,724 and 3,505 gene model revisions were proposed. The reliability of the proposed gene models is assessed by an a posteriori introspection procedure of each intron and exon in the multiple gene model alignment. The total number and type of proposed gene model revisions in the six fungal genomes is correlated to the quality of the genome assembly, and to sequencing strategies used in the sequencing centre, highlighting different types of errors in different annotation pipelines. The ABFGP method is particularly successful in discovering sequence errors and/or disruptive mutations causing truncated and erroneous gene models. 
CONCLUSIONS: The ABFGP method is an accurate and fully automated quality control method for fungal gene catalogues that can be easily implemented into existing annotation pipelines. With the exponential release of new genomes, the ABFGP method will help decrease the number of gene models that require additional manual curation.
format Online
Article
Text
id pubmed-3898260
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3898260 2014-01-23 Automated alignment-based curation of gene models in filamentous fungi van der Burgt, Ate Severing, Edouard Collemare, Jérôme de Wit, Pierre JGM BMC Bioinformatics Methodology Article BioMed Central 2014-01-16 /pmc/articles/PMC3898260/ /pubmed/24433567 http://dx.doi.org/10.1186/1471-2105-15-19 Text en Copyright © 2014 van der Burgt et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Automated alignment-based curation of gene models in filamentous fungi
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3898260/
https://www.ncbi.nlm.nih.gov/pubmed/24433567
http://dx.doi.org/10.1186/1471-2105-15-19