
Identification and characterization of pseudogenes in the rice gene complement


Bibliographic Details
Main Authors:	Thibaud-Nissen, Françoise; Ouyang, Shu; Buell, C Robin
Format:	Text
Language:	English
Published:	BioMed Central 2009
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2724416/ https://www.ncbi.nlm.nih.gov/pubmed/19607679 http://dx.doi.org/10.1186/1471-2164-10-317

_version_	1782170412712460288
author	Thibaud-Nissen, Françoise; Ouyang, Shu; Buell, C Robin
collection	PubMed
description	BACKGROUND: The Osa1 Genome Annotation of rice (Oryza sativa L. ssp. japonica cv. Nipponbare) is the product of a semi-automated pipeline that does not explicitly predict pseudogenes. As such, it is likely to mis-annotate pseudogenes as functional genes. A total of 22,033 gene models within the Osa1 Release 5 were investigated as potential pseudogenes as these genes exhibit at least one feature potentially indicative of pseudogenes: lack of transcript support, short coding region, long untranslated region, or, for genes residing within a segmentally duplicated region, lack of a paralog or significantly shorter corresponding paralog. RESULTS: A total of 1,439 pseudogenes, identified among genes with pseudogene features, were characterized by similarity to fully-supported gene models and the presence of frameshifts or premature translational stop codons. Significant difference in the length of duplicated genes within segmentally-duplicated regions was the optimal indicator of pseudogenization. Among the 816 pseudogenes for which a probable origin could be determined, 75% originated from gene duplication events while 25% were the result of retrotransposition events. A total of 12% of the pseudogenes were expressed. Finally, F-box proteins, BTB/POZ proteins, terpene synthases, chalcone synthases and cytochrome P450 protein families were found to harbor large numbers of pseudogenes. CONCLUSION: These pseudogenes still have a detectable open reading frame and are thus distinct from pseudogenes detected within intergenic regions which typically lack definable open reading frames. Families containing the highest number of pseudogenes are fast-evolving families involved in ubiquitination and secondary metabolism.
format	Text
id	pubmed-2724416
institution	National Center for Biotechnology Information
language	English
publishDate	2009
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-2724416 2009-08-11 Identification and characterization of pseudogenes in the rice gene complement Thibaud-Nissen, Françoise; Ouyang, Shu; Buell, C Robin BMC Genomics Research Article BioMed Central 2009-07-16 /pmc/articles/PMC2724416/ /pubmed/19607679 http://dx.doi.org/10.1186/1471-2164-10-317 Text en Copyright © 2009 Thibaud-Nissen et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Identification and characterization of pseudogenes in the rice gene complement
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2724416/ https://www.ncbi.nlm.nih.gov/pubmed/19607679 http://dx.doi.org/10.1186/1471-2164-10-317
