Experimental validation of novel genes predicted in the un-annotated regions of the Arabidopsis genome
BACKGROUND: Several lines of evidence support the existence of novel genes and other transcribed units which have not yet been annotated in the Arabidopsis genome. Two gene prediction programs which make use of comparative genomic analysis, Twinscan and EuGene, have recently been deployed on the Ara...
Main Authors: | Moskal, William A; Wu, Hank C; Underwood, Beverly A; Wang, Wei; Town, Christopher D; Xiao, Yongli |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central, 2007 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1783852/ https://www.ncbi.nlm.nih.gov/pubmed/17229318 http://dx.doi.org/10.1186/1471-2164-8-18 |
_version_ | 1782132022869753856 |
---|---|
author | Moskal, William A; Wu, Hank C; Underwood, Beverly A; Wang, Wei; Town, Christopher D; Xiao, Yongli |
author_facet | Moskal, William A; Wu, Hank C; Underwood, Beverly A; Wang, Wei; Town, Christopher D; Xiao, Yongli |
author_sort | Moskal, William A |
collection | PubMed |
description | BACKGROUND: Several lines of evidence support the existence of novel genes and other transcribed units which have not yet been annotated in the Arabidopsis genome. Two gene prediction programs which make use of comparative genomic analysis, Twinscan and EuGene, have recently been deployed on the Arabidopsis genome. The ability of these programs to make use of sequence data from other species has allowed both Twinscan and EuGene to predict over 1000 genes that are intergenic with respect to the most recent annotation release. A high throughput RACE pipeline was utilized in an attempt to verify the structure and expression of these novel genes. RESULTS: 1,071 un-annotated loci were targeted by RACE, and full length sequence coverage was obtained for 35% of the targeted genes. We have verified the structure and expression of 378 genes that were not present within the most recent release of the Arabidopsis genome annotation. These 378 genes represent a structurally diverse set of transcripts and encode a functionally diverse set of proteins. CONCLUSION: We have investigated the accuracy of the Twinscan and EuGene gene prediction programs and found them to be reliable predictors of gene structure in Arabidopsis. Several hundred previously un-annotated genes were validated by this work. Based upon this information derived from these efforts it is likely that the Arabidopsis genome annotation continues to overlook several hundred protein coding genes. |
format | Text |
id | pubmed-1783852 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2007 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-17838522007-01-30 Experimental validation of novel genes predicted in the un-annotated regions of the Arabidopsis genome Moskal, William A Wu, Hank C Underwood, Beverly A Wang, Wei Town, Christopher D Xiao, Yongli BMC Genomics Research Article BACKGROUND: Several lines of evidence support the existence of novel genes and other transcribed units which have not yet been annotated in the Arabidopsis genome. Two gene prediction programs which make use of comparative genomic analysis, Twinscan and EuGene, have recently been deployed on the Arabidopsis genome. The ability of these programs to make use of sequence data from other species has allowed both Twinscan and EuGene to predict over 1000 genes that are intergenic with respect to the most recent annotation release. A high throughput RACE pipeline was utilized in an attempt to verify the structure and expression of these novel genes. RESULTS: 1,071 un-annotated loci were targeted by RACE, and full length sequence coverage was obtained for 35% of the targeted genes. We have verified the structure and expression of 378 genes that were not present within the most recent release of the Arabidopsis genome annotation. These 378 genes represent a structurally diverse set of transcripts and encode a functionally diverse set of proteins. CONCLUSION: We have investigated the accuracy of the Twinscan and EuGene gene prediction programs and found them to be reliable predictors of gene structure in Arabidopsis. Several hundred previously un-annotated genes were validated by this work. Based upon this information derived from these efforts it is likely that the Arabidopsis genome annotation continues to overlook several hundred protein coding genes. BioMed Central 2007-01-17 /pmc/articles/PMC1783852/ /pubmed/17229318 http://dx.doi.org/10.1186/1471-2164-8-18 Text en Copyright © 2007 Moskal et al; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( (http://creativecommons.org/licenses/by/2.0) ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Moskal, William A Wu, Hank C Underwood, Beverly A Wang, Wei Town, Christopher D Xiao, Yongli Experimental validation of novel genes predicted in the un-annotated regions of the Arabidopsis genome |
title | Experimental validation of novel genes predicted in the un-annotated regions of the Arabidopsis genome |
title_full | Experimental validation of novel genes predicted in the un-annotated regions of the Arabidopsis genome |
title_fullStr | Experimental validation of novel genes predicted in the un-annotated regions of the Arabidopsis genome |
title_full_unstemmed | Experimental validation of novel genes predicted in the un-annotated regions of the Arabidopsis genome |
title_short | Experimental validation of novel genes predicted in the un-annotated regions of the Arabidopsis genome |
title_sort | experimental validation of novel genes predicted in the un-annotated regions of the arabidopsis genome |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1783852/ https://www.ncbi.nlm.nih.gov/pubmed/17229318 http://dx.doi.org/10.1186/1471-2164-8-18 |