A pipeline for automated annotation of yeast genome sequences by a conserved-synteny approach
BACKGROUND: Yeasts are a model system for exploring eukaryotic genome evolution. Next-generation sequencing technologies are poised to vastly increase the number of yeast genome sequences, both from resequencing projects (population studies) and from de novo sequencing projects (new species). Howeve...
Main authors: | Proux-Wéra, Estelle; Armisén, David; Byrne, Kevin P; Wolfe, Kenneth H |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2012 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3507789/ https://www.ncbi.nlm.nih.gov/pubmed/22984983 http://dx.doi.org/10.1186/1471-2105-13-237 |
_version_ | 1782251133580869632 |
---|---|
author | Proux-Wéra, Estelle; Armisén, David; Byrne, Kevin P; Wolfe, Kenneth H |
author_facet | Proux-Wéra, Estelle; Armisén, David; Byrne, Kevin P; Wolfe, Kenneth H |
author_sort | Proux-Wéra, Estelle |
collection | PubMed |
description | BACKGROUND: Yeasts are a model system for exploring eukaryotic genome evolution. Next-generation sequencing technologies are poised to vastly increase the number of yeast genome sequences, both from resequencing projects (population studies) and from de novo sequencing projects (new species). However, the annotation of genomes presents a major bottleneck for de novo projects, because it still relies on a process that is largely manual. RESULTS: Here we present the Yeast Genome Annotation Pipeline (YGAP), an automated system designed specifically for new yeast genome sequences lacking transcriptome data. YGAP does automatic de novo annotation, exploiting homology and synteny information from other yeast species stored in the Yeast Gene Order Browser (YGOB) database. The basic premises underlying YGAP's approach are that data from other species already tells us what genes we should expect to find in any particular genomic region and that we should also expect that orthologous genes are likely to have similar intron/exon structures. Additionally, it is able to detect probable frameshift sequencing errors and can propose corrections for them. YGAP searches intelligently for introns, and detects tRNA genes and Ty-like elements. CONCLUSIONS: In tests on Saccharomyces cerevisiae and on the genomes of Naumovozyma castellii and Tetrapisispora blattae newly sequenced with Roche-454 technology, YGAP outperformed another popular annotation program (AUGUSTUS). For S. cerevisiae and N. castellii, 91-93% of YGAP's predicted gene structures were identical to those in previous manually curated gene sets. YGAP has been implemented as a webserver with a user-friendly interface at http://wolfe.gen.tcd.ie/annotation. |
format | Online Article Text |
id | pubmed-3507789 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3507789 2012-11-28 A pipeline for automated annotation of yeast genome sequences by a conserved-synteny approach. Proux-Wéra, Estelle; Armisén, David; Byrne, Kevin P; Wolfe, Kenneth H. BMC Bioinformatics. Research Article. BioMed Central 2012-09-17 /pmc/articles/PMC3507789/ /pubmed/22984983 http://dx.doi.org/10.1186/1471-2105-13-237 Text en Copyright ©2012 Proux-Wéra et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | A pipeline for automated annotation of yeast genome sequences by a conserved-synteny approach |
title_sort | pipeline for automated annotation of yeast genome sequences by a conserved-synteny approach |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3507789/ https://www.ncbi.nlm.nih.gov/pubmed/22984983 http://dx.doi.org/10.1186/1471-2105-13-237 |
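
The abstract's central premise, that gene order in related species indicates which genes to expect in a given genomic region, can be illustrated with a toy example. The sketch below is not YGAP's implementation and does not use the YGOB database; the function names, the gene names, and the assumption of a single collinear reference gene order are all hypothetical simplifications chosen for illustration.

```python
from typing import Dict, List, Tuple


def expected_genes_between(
    reference_order: List[str], left_anchor: str, right_anchor: str
) -> List[str]:
    """Return the reference genes lying strictly between two anchor genes.

    Hypothetical helper: returns an empty list if either anchor is absent
    from the reference gene order.
    """
    position = {gene: i for i, gene in enumerate(reference_order)}
    if left_anchor not in position or right_anchor not in position:
        return []
    i, j = position[left_anchor], position[right_anchor]
    if i > j:  # tolerate anchors given in reverse orientation
        i, j = j, i
    return reference_order[i + 1 : j]


def missing_by_synteny(
    reference_order: List[str], annotated_order: List[str]
) -> Dict[Tuple[str, str], List[str]]:
    """For each pair of adjacent annotated genes, list the reference genes
    that conserved gene order suggests should lie between them."""
    gaps: Dict[Tuple[str, str], List[str]] = {}
    for left, right in zip(annotated_order, annotated_order[1:]):
        expected = expected_genes_between(reference_order, left, right)
        if expected:
            gaps[(left, right)] = expected
    return gaps


# Toy data (assumed): gene order in a related species, and the genes
# annotated so far, in order, on a contig of the new genome.
reference = ["geneA", "geneB", "geneC", "geneD", "geneE"]
annotated = ["geneA", "geneB", "geneD", "geneE"]

gaps = missing_by_synteny(reference, annotated)
print(gaps)  # {('geneB', 'geneD'): ['geneC']}
```

In this toy case the interval between geneB and geneD is flagged as a place to search for a homolog of geneC. A real pipeline would also need to handle rearrangements, multiple reference species, and sequence-level evidence, none of which this sketch attempts.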