Characterization of paralogous protein families in rice
BACKGROUND: High gene numbers in plant genomes reflect polyploidy and major gene duplication events. Oryza sativa, cultivated rice, is a diploid monocotyledonous species with a ~390 Mb genome that has undergone segmental duplication of a substantial portion of its genome. This, coupled with other ge...
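Main authors: | Lin, Haining; Ouyang, Shu; Egan, Amy; Nobuta, Kan; Haas, Brian J; Zhu, Wei; Gu, Xun; Silva, Joana C; Meyers, Blake C; Buell, C Robin |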
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2008 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2275729/ https://www.ncbi.nlm.nih.gov/pubmed/18284697 http://dx.doi.org/10.1186/1471-2229-8-18 |
_version_ | 1782151895920410624 |
---|---|
author | Lin, Haining Ouyang, Shu Egan, Amy Nobuta, Kan Haas, Brian J Zhu, Wei Gu, Xun Silva, Joana C Meyers, Blake C Buell, C Robin |
author_sort | Lin, Haining |
collection | PubMed |
description | BACKGROUND: High gene numbers in plant genomes reflect polyploidy and major gene duplication events. Oryza sativa, cultivated rice, is a diploid monocotyledonous species with a ~390 Mb genome that has undergone segmental duplication of a substantial portion of its genome. This, coupled with other genetic events such as tandem duplications, has resulted in a substantial number of its genes, and resulting proteins, occurring in paralogous families. RESULTS: Using a computational pipeline that utilizes Pfam and novel protein domains, we characterized paralogous families in rice and compared these with paralogous families in the model dicotyledonous diploid species, Arabidopsis thaliana. Arabidopsis, which has undergone genome duplication as well, has a substantially smaller genome (~120 Mb) and gene complement compared to rice. Overall, 53% and 68% of the non-transposable element-related rice and Arabidopsis proteins could be classified into paralogous protein families, respectively. Singleton and paralogous family genes differed substantially in their likelihood of encoding a protein of known or putative function; 26% and 66% of singleton genes compared to 73% and 96% of the paralogous family genes encode a known or putative protein in rice and Arabidopsis, respectively. Furthermore, a major skew in the distribution of specific gene function was observed; a total of 17 Gene Ontology categories in both rice and Arabidopsis were statistically significant in their differential distribution between paralogous family and singleton proteins. In contrast to mammalian organisms, we found that duplicated genes in rice and Arabidopsis tend to have more alternative splice forms. Using data from Massively Parallel Signature Sequencing, we show that a significant portion of the duplicated genes in rice show divergent expression although a correlation between sequence divergence and correlation of expression could be seen in very young genes. CONCLUSION: Collectively, these data suggest that while co-regulation and conserved function are present in some paralogous protein family members, evolutionary pressures have resulted in functional divergence with differential expression patterns. |
format | Text |
id | pubmed-2275729 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2008 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2275729 2008-03-27 Characterization of paralogous protein families in rice Lin, Haining Ouyang, Shu Egan, Amy Nobuta, Kan Haas, Brian J Zhu, Wei Gu, Xun Silva, Joana C Meyers, Blake C Buell, C Robin BMC Plant Biol Research Article BioMed Central 2008-02-19 /pmc/articles/PMC2275729/ /pubmed/18284697 http://dx.doi.org/10.1186/1471-2229-8-18 Text en Copyright © 2008 Lin et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Characterization of paralogous protein families in rice |
topic | Research Article |