Comparative analyses reveal distinct sets of lineage-specific genes within Arabidopsis thaliana

BACKGROUND: The availability of genome and transcriptome sequences for a number of species permits the identification and characterization of conserved as well as divergent genes such as lineage-specific genes which have no detectable sequence similarity to genes from other lineages. While genes con...

Bibliographic Details
Main Authors: Lin, Haining; Moghe, Gaurav; Ouyang, Shu; Iezzoni, Amy; Shiu, Shin-Han; Gu, Xun; Buell, C Robin
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2829037/
https://www.ncbi.nlm.nih.gov/pubmed/20152032
http://dx.doi.org/10.1186/1471-2148-10-41
author Lin, Haining
Moghe, Gaurav
Ouyang, Shu
Iezzoni, Amy
Shiu, Shin-Han
Gu, Xun
Buell, C Robin
collection PubMed
description BACKGROUND: The availability of genome and transcriptome sequences for a number of species permits the identification and characterization of conserved as well as divergent genes, such as lineage-specific genes, which have no detectable sequence similarity to genes from other lineages. While genes conserved among taxa provide insight into the core processes among species, lineage-specific genes provide insights into evolutionary processes and biological functions that are likely clade- or species-specific. RESULTS: Comparative analyses using the Arabidopsis thaliana genome and sequences from 178 other species within the Plant Kingdom enabled the identification of 24,624 A. thaliana genes (91.7%) that were termed Evolutionary Conserved (EC), as defined by sequence similarity to a database entry, as well as two sets of lineage-specific genes within A. thaliana. One of the A. thaliana lineage-specific gene sets shares sequence similarity only to sequences from species within the Brassicaceae family and is termed Conserved Brassicaceae-Specific Genes (914, 3.4%, CBSG). The other set of A. thaliana lineage-specific genes, the Arabidopsis Lineage-Specific Genes (1,324, 4.9%, ALSG), lacks sequence similarity to any sequence outside A. thaliana. While many CBSGs (76.7%) and ALSGs (52.9%) are transcribed, the majority of the CBSGs (76.1%) and ALSGs (94.4%) have no annotated function. Co-expression analysis indicated significant enrichment of the CBSGs and ALSGs in multiple functional categories, suggesting their involvement in a wide range of biological functions. Subcellular localization prediction revealed that the CBSGs were significantly enriched in proteins targeted to the secretory pathway (412, 45.1%).
Among the 107 putatively secreted CBSGs with known functions, 67 encode a putative pollen coat protein or cysteine-rich protein with sequence similarity to the S-locus cysteine-rich protein that is the pollen determinant controlling allele-specific pollen rejection in self-incompatible Brassicaceae species. Overall, the ALSGs and CBSGs were more highly methylated in floral tissue compared to the ECs. Single Nucleotide Polymorphism (SNP) analysis showed an elevated ratio of non-synonymous to synonymous SNPs within the ALSGs (1.99) and CBSGs (1.65) relative to the EC set (0.92), mainly caused by an elevated number of non-synonymous SNPs, indicating that they are fast-evolving at the protein sequence level. CONCLUSIONS: Our analyses suggest that while a significant fraction of the A. thaliana proteome is conserved within the Plant Kingdom, evolutionarily distinct sets of genes that may function in defining biological processes unique to these lineages have arisen within the Brassicaceae and A. thaliana.
format Text
id pubmed-2829037
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2829037 2010-02-26 Comparative analyses reveal distinct sets of lineage-specific genes within Arabidopsis thaliana Lin, Haining Moghe, Gaurav Ouyang, Shu Iezzoni, Amy Shiu, Shin-Han Gu, Xun Buell, C Robin BMC Evol Biol Research article BioMed Central 2010-02-12 /pmc/articles/PMC2829037/ /pubmed/20152032 http://dx.doi.org/10.1186/1471-2148-10-41 Text en Copyright ©2010 Lin et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Comparative analyses reveal distinct sets of lineage-specific genes within Arabidopsis thaliana
topic Research article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2829037/
https://www.ncbi.nlm.nih.gov/pubmed/20152032
http://dx.doi.org/10.1186/1471-2148-10-41