Predicting Protein Function with Hierarchical Phylogenetic Profiles: The Gene3D Phylo-Tuner Method Applied to Eukaryotic Genomes

“Phylogenetic profiling” is based on the hypothesis that during evolution functionally or physically interacting genes are likely to be inherited or eliminated in a codependent manner. Creating presence–absence profiles of orthologous genes is now a common and powerful way of identifying functionall...

Bibliographic Details
Main authors: Ranea, Juan A. G; Yeats, Corin; Grant, Alastair; Orengo, Christine A
Format: Text
Language: English
Published: Public Library of Science 2007
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2098864/
https://www.ncbi.nlm.nih.gov/pubmed/18052542
http://dx.doi.org/10.1371/journal.pcbi.0030237
collection PubMed
description “Phylogenetic profiling” is based on the hypothesis that during evolution functionally or physically interacting genes are likely to be inherited or eliminated in a codependent manner. Creating presence–absence profiles of orthologous genes is now a common and powerful way of identifying functionally associated genes. In this approach, correctly determining orthology, as a means of identifying functional equivalence between two genes, is a critical and nontrivial step and largely explains why previous work in this area has mainly focused on using presence–absence profiles in prokaryotic species. Here, we demonstrate that eukaryotic genomes have a high proportion of multigene families whose phylogenetic profile distributions are poor in presence–absence information content. This feature makes them prone to orthology mis-assignment and unsuited to standard profile-based prediction methods. Using CATH structural domain assignments from the Gene3D database for 13 complete eukaryotic genomes, we have developed a novel modification of the phylogenetic profiling method that uses genome copy number of each domain superfamily to predict functional relationships. In our approach, superfamilies are subclustered at ten levels of sequence identity—from 30% to 100%—and phylogenetic profiles built at each level. All the profiles are compared using normalised Euclidean distances to identify those with correlated changes in their domain copy number. We demonstrate that two protein families will “auto-tune” with strong co-evolutionary signals when their profiles are compared at the similarity levels that capture their functional relationship. Our method finds functional relationships that are not detectable by the conventional presence–absence profile comparisons, and it does not require a priori any fixed criteria to define orthologous genes.
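The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract only states that copy-number profiles are built at several sequence-identity levels and compared with normalised Euclidean distances. The unit-length normalisation, the function names (`normalise`, `distance`, `best_level_pair`), the two identity levels, and the toy copy numbers are all assumptions made for illustration.

```python
import math
from itertools import product


def normalise(profile):
    """Scale a copy-number profile to unit length (an assumed normalisation)."""
    norm = math.sqrt(sum(c * c for c in profile))
    return [c / norm for c in profile] if norm else list(profile)


def distance(p, q):
    """Euclidean distance between two normalised profiles over the same genomes."""
    a, b = normalise(p), normalise(q)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_level_pair(family_a, family_b):
    """Compare every subcluster profile of one family with every one of the other.

    Each argument maps an identity level (percent) to a list of subcluster
    profiles; a profile holds the domain copy number in each genome.
    Returns (distance, level_a, cluster_a, level_b, cluster_b) for the
    closest pair, i.e. the levels at which the two families "tune".
    """
    candidates = []
    for (lev_a, clusters_a), (lev_b, clusters_b) in product(
        family_a.items(), family_b.items()
    ):
        for (i, p), (j, q) in product(enumerate(clusters_a), enumerate(clusters_b)):
            candidates.append((distance(p, q), lev_a, i, lev_b, j))
    return min(candidates)


# Hypothetical toy data: copy numbers in four genomes at two identity levels.
family_a = {30: [[4, 6, 2, 8]], 60: [[1, 3, 0, 4], [3, 3, 2, 4]]}
family_b = {30: [[5, 5, 5, 5]], 60: [[2, 6, 0, 8], [3, 1, 5, 0]]}

best = best_level_pair(family_a, family_b)
print(best)
```

In this toy data the first 60% subcluster of each family has proportional copy numbers across the four genomes, so that pair gives the smallest distance, while the 30% profiles of the same families look unrelated. This mirrors the abstract's point that a co-evolutionary signal may only appear at particular similarity levels.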
format Text
id pubmed-2098864
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2098864 2007-11-29 PLoS Comput Biol Research Article Public Library of Science 2007-11 2007-11-30 Text en Copyright: © 2007 Ranea et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. http://creativecommons.org/licenses/by/4.0/
topic Research Article