Cargando…

Protein domains and architectural innovation in plant-associated Proteobacteria

BACKGROUND: Evolution of new complex biological behaviour tends to arise by novel combinations of existing building blocks. The functional and evolutionary building blocks of the proteome are protein domains, the function of a protein being dependent on its constituent domains. We clustered complete...

Descripción completa

Detalles Bibliográficos
Autores principales: Studholme, David J, Downie, J Allan, Preston, Gail M
Formato: Texto
Lenguaje:English
Publicado: BioMed Central 2005
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC554113/
https://www.ncbi.nlm.nih.gov/pubmed/15715905
http://dx.doi.org/10.1186/1471-2164-6-17
_version_ 1782122509109297152
author Studholme, David J
Downie, J Allan
Preston, Gail M
author_facet Studholme, David J
Downie, J Allan
Preston, Gail M
author_sort Studholme, David J
collection PubMed
description BACKGROUND: Evolution of new complex biological behaviour tends to arise by novel combinations of existing building blocks. The functional and evolutionary building blocks of the proteome are protein domains, the function of a protein being dependent on its constituent domains. We clustered completely-sequenced proteomes of prokaryotes on the basis of their protein domain content, as defined by Pfam (release 16.0). This revealed that, although there was a correlation between phylogeny and domain content, other factors also have an influence. This observation motivated an investigation of the relationship between an organism's lifestyle and the complement of domains and domain architectures found within its proteome. RESULTS: We took a census of all protein domains and domain combinations (architectures) encoded in the completely-sequenced proteobacterial genomes. Nine protein domain families were identified that are found in phylogenetically disparate plant-associated bacteria but are absent from non-plant-associated bacteria. Most of these are known to play a role in the plant-associated lifestyle, but they also included domain of unknown function DUF1427, which is found in plant symbionts and pathogens of the alpha-, beta- and gamma-Proteobacteria, but not known in any other organism. Further, several domains were identified as being restricted to phytobacteria and Eukaryotes. One example is the RolB/RolC glucosidase family, which is found only in Agrobacterium species and in plants. We identified the 0.5% of Pfam protein domain families that were most significantly over-represented in the plant-associated Proteobacteria with respect to the background frequencies in the whole set of available proteobacterial proteomes. These included guanylate cyclase, domains implicated in aromatic catabolism, cellulase and several domains of unknown function. We identified 459 unique domain architectures found in phylogenetically diverse plant pathogens and symbionts that were absent from non-pathogenic and non-symbiotic relatives. The vast majority of these were restricted to a single species or several closely related species and so their distributions could be better explained by phylogeny than by lifestyle. However, several architectures were found in two or more very distantly related phytobacteria but absent from non-plant-associated bacteria. Many of the proteins with these unique architectures are predicted to be secreted. In Pseudomonas syringae pathovar tomato, those genes encoding genes with novel domain architectures tended to have atypical GC contents and were adjacent to insertion sequence elements and phage-like sequences, suggesting acquisition by horizontal transfer. CONCLUSIONS: By identifying domains and architectures unique to plant pathogens and symbionts, we highlighted candidate proteins for involvement in plant-associated bacterial lifestyles. Given that characterisation of novel gene products in vivo and in vitro is time-consuming and expensive, this computational approach may be useful for reducing experimental search space. Furthermore we discuss the biological significance of novel proteins highlighted by this study in the context of plant-associated lifestyles.
format Text
id pubmed-554113
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5541132005-03-13 Protein domains and architectural innovation in plant-associated Proteobacteria Studholme, David J Downie, J Allan Preston, Gail M BMC Genomics Research Article BACKGROUND: Evolution of new complex biological behaviour tends to arise by novel combinations of existing building blocks. The functional and evolutionary building blocks of the proteome are protein domains, the function of a protein being dependent on its constituent domains. We clustered completely-sequenced proteomes of prokaryotes on the basis of their protein domain content, as defined by Pfam (release 16.0). This revealed that, although there was a correlation between phylogeny and domain content, other factors also have an influence. This observation motivated an investigation of the relationship between an organism's lifestyle and the complement of domains and domain architectures found within its proteome. RESULTS: We took a census of all protein domains and domain combinations (architectures) encoded in the completely-sequenced proteobacterial genomes. Nine protein domain families were identified that are found in phylogenetically disparate plant-associated bacteria but are absent from non-plant-associated bacteria. Most of these are known to play a role in the plant-associated lifestyle, but they also included domain of unknown function DUF1427, which is found in plant symbionts and pathogens of the alpha-, beta- and gamma-Proteobacteria, but not known in any other organism. Further, several domains were identified as being restricted to phytobacteria and Eukaryotes. One example is the RolB/RolC glucosidase family, which is found only in Agrobacterium species and in plants. We identified the 0.5% of Pfam protein domain families that were most significantly over-represented in the plant-associated Proteobacteria with respect to the background frequencies in the whole set of available proteobacterial proteomes. These included guanylate cyclase, domains implicated in aromatic catabolism, cellulase and several domains of unknown function. We identified 459 unique domain architectures found in phylogenetically diverse plant pathogens and symbionts that were absent from non-pathogenic and non-symbiotic relatives. The vast majority of these were restricted to a single species or several closely related species and so their distributions could be better explained by phylogeny than by lifestyle. However, several architectures were found in two or more very distantly related phytobacteria but absent from non-plant-associated bacteria. Many of the proteins with these unique architectures are predicted to be secreted. In Pseudomonas syringae pathovar tomato, those genes encoding genes with novel domain architectures tended to have atypical GC contents and were adjacent to insertion sequence elements and phage-like sequences, suggesting acquisition by horizontal transfer. CONCLUSIONS: By identifying domains and architectures unique to plant pathogens and symbionts, we highlighted candidate proteins for involvement in plant-associated bacterial lifestyles. Given that characterisation of novel gene products in vivo and in vitro is time-consuming and expensive, this computational approach may be useful for reducing experimental search space. Furthermore we discuss the biological significance of novel proteins highlighted by this study in the context of plant-associated lifestyles. BioMed Central 2005-02-16 /pmc/articles/PMC554113/ /pubmed/15715905 http://dx.doi.org/10.1186/1471-2164-6-17 Text en Copyright © 2005 Studholme et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( (http://creativecommons.org/licenses/by/2.0) ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Studholme, David J
Downie, J Allan
Preston, Gail M
Protein domains and architectural innovation in plant-associated Proteobacteria
title Protein domains and architectural innovation in plant-associated Proteobacteria
title_full Protein domains and architectural innovation in plant-associated Proteobacteria
title_fullStr Protein domains and architectural innovation in plant-associated Proteobacteria
title_full_unstemmed Protein domains and architectural innovation in plant-associated Proteobacteria
title_short Protein domains and architectural innovation in plant-associated Proteobacteria
title_sort protein domains and architectural innovation in plant-associated proteobacteria
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC554113/
https://www.ncbi.nlm.nih.gov/pubmed/15715905
http://dx.doi.org/10.1186/1471-2164-6-17
work_keys_str_mv AT studholmedavidj proteindomainsandarchitecturalinnovationinplantassociatedproteobacteria
AT downiejallan proteindomainsandarchitecturalinnovationinplantassociatedproteobacteria
AT prestongailm proteindomainsandarchitecturalinnovationinplantassociatedproteobacteria