Re-annotation of the woodland strawberry (Fragaria vesca) genome
BACKGROUND: Fragaria vesca is a low-growing, small-fruited diploid strawberry species commonly called woodland strawberry. It is native to temperate regions of Eurasia and North America and, while it produces edible fruits, it is most useful as an experimental perennial plant system that can s...
Main Authors: | Darwish, Omar; Shahan, Rachel; Liu, Zhongchi; Slovin, Janet P; Alkharouf, Nadim W |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4318131/ https://www.ncbi.nlm.nih.gov/pubmed/25623424 http://dx.doi.org/10.1186/s12864-015-1221-1 |
_version_ | 1782355806006542336 |
---|---|
author | Darwish, Omar; Shahan, Rachel; Liu, Zhongchi; Slovin, Janet P; Alkharouf, Nadim W |
author_sort | Darwish, Omar |
collection | PubMed |
description | BACKGROUND: Fragaria vesca is a low-growing, small-fruited diploid strawberry species commonly called woodland strawberry. It is native to temperate regions of Eurasia and North America and, while it produces edible fruits, it is most useful as an experimental perennial plant system that can serve as a model for the agriculturally important Rosaceae family. A draft of the F. vesca genome sequence was published in 2011 [Nat Genet 43:223,2011]. The first-generation annotation (version 1.1) was developed using GeneMark-ES+ [Nuc Acids Res 33:6494,2005], a self-training gene prediction tool that relies primarily on combining ab initio predictions with the mapping of high-confidence ESTs, in addition to mapping gene deserts from transposable elements. Based on over 25 different tissue transcriptomes, we have revised the F. vesca genome annotation, thereby providing several improvements over version 1.1. RESULTS: The new annotation, which was achieved using Maker, describes many more predicted protein-coding genes than the GeneMark-generated annotation that is currently hosted at the Genome Database for Rosaceae (http://www.rosaceae.org/). Our new annotation also increases the overall total coding length and the number of coding regions found. The total number of gene predictions that do not overlap with the previous annotations is 2286, most of which were found to be homologous to other plant genes. We have experimentally verified one of the new gene model predictions to validate our results. CONCLUSIONS: Using the RNA-Seq transcriptome sequences from 25 diverse tissue types, the re-annotation pipeline improved existing annotations by increasing the annotation accuracy based on extensive transcriptome data. It uncovered new genes, added exons to current genes, and extended or merged exons. This complete genome re-annotation will significantly benefit functional genomic studies of the strawberry and other members of the Rosaceae. |
format | Online Article Text |
id | pubmed-4318131 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4318131 2015-02-06 Re-annotation of the woodland strawberry (Fragaria vesca) genome Darwish, Omar; Shahan, Rachel; Liu, Zhongchi; Slovin, Janet P; Alkharouf, Nadim W BMC Genomics Research Article BioMed Central 2015-01-27 /pmc/articles/PMC4318131/ /pubmed/25623424 http://dx.doi.org/10.1186/s12864-015-1221-1 Text en © Darwish et al.; licensee BioMed Central. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Re-annotation of the woodland strawberry (Fragaria vesca) genome |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4318131/ https://www.ncbi.nlm.nih.gov/pubmed/25623424 http://dx.doi.org/10.1186/s12864-015-1221-1 |