
An improved method for identifying functionally linked proteins using phylogenetic profiles

BACKGROUND: Phylogenetic profiles record the occurrence of homologs of genes across fully sequenced organisms. Proteins with similar profiles are typically components of protein complexes or metabolic pathways. Various existing methods measure similarity between two profiles and, hence, the likeliho...


Bibliographic Details
Main Authors: Cokus, Shawn; Mizutani, Sayaka; Pellegrini, Matteo
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1892086/
https://www.ncbi.nlm.nih.gov/pubmed/17570150
http://dx.doi.org/10.1186/1471-2105-8-S4-S7
author Cokus, Shawn
Mizutani, Sayaka
Pellegrini, Matteo
collection PubMed
description BACKGROUND: Phylogenetic profiles record the occurrence of homologs of genes across fully sequenced organisms. Proteins with similar profiles are typically components of protein complexes or metabolic pathways. Various existing methods measure similarity between two profiles and, hence, the likelihood that the two proteins co-evolve. Some methods ignore phylogenetic relationships between organisms while others account for such with metrics that explicitly model the likelihood of two proteins co-evolving on a tree. The latter methods more sensitively detect co-evolving proteins, but at a significant computational cost. Here we propose a novel heuristic to improve phylogenetic profile analysis that accounts for phylogenetic relationships between genomes in a computationally efficient fashion. We first order the genomes within profiles and then enumerate runs of consecutive matches and accurately compute the probability of observing these. We hypothesize that profiles with many runs are more likely to involve functionally related proteins than profiles in which all the matches are concentrated in one interval of the tree.
RESULTS: We compared our approach to various previously published methods that both ignore and incorporate the underlying phylogeny between organisms. To evaluate performance, we compare the functional similarity of rank-ordered lists of protein pairs that share similar phylogenetic profiles by assessing significance of overlap in their Gene Ontology annotations. Accounting for runs in phylogenetic profile matches improves our ability to identify functionally related pairs of proteins. Furthermore, the networks that result from our approach tend to have smaller clusters of co-evolving proteins than networks computed using previous approaches and are thus more useful for inferring functional relationships. Finally, we report that our approach is orders of magnitude more computationally efficient than full tree-based methods.
CONCLUSION: We have developed an improved method for analyzing phylogenetic profiles. The method allows us to more accurately and efficiently infer functional relationships between proteins based on these profiles than other published approaches. As the number of fully sequenced genomes increases, it becomes more important to account for evolutionary relationships among organisms in comparative analyses. Our approach, therefore, serves as an important example of how these relationships may be accounted for in an efficient manner.
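The run-enumeration step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `match_runs`, the 0/1 profile encoding, and the definition of a "match" as a genome in which both proteins have a homolog are assumptions made here for illustration. The sketch assumes the genomes have already been ordered, and it does not attempt the probability calculation, which the abstract does not specify.

```python
from typing import List, Sequence


def match_runs(profile_a: Sequence[int], profile_b: Sequence[int]) -> List[int]:
    """Return the lengths of maximal runs of consecutive matches.

    Assumption: profiles are 0/1 vectors over the same ordered genomes,
    and a "match" is a genome where both proteins are present (both 1).
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genomes")

    runs: List[int] = []
    current = 0
    for a, b in zip(profile_a, profile_b):
        if a and b:
            current += 1          # extend the current run of matches
        else:
            if current:
                runs.append(current)  # a mismatch closes the run
            current = 0
    if current:
        runs.append(current)      # close a run that reaches the last genome
    return runs


# Two hypothetical profile pairs, each with five matches.
scattered = match_runs([1, 1, 0, 1, 1, 1, 0, 1], [1, 1, 1, 0, 1, 1, 0, 1])
clustered = match_runs([1, 1, 1, 1, 1, 0, 0, 0], [1, 1, 1, 1, 1, 0, 0, 0])
print(scattered, clustered)
```

Both hypothetical pairs share five matches, but the first spreads them over three runs while the second concentrates them in a single interval, which is the distinction the abstract's hypothesis relies on.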
format Text
id pubmed-1892086
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1892086 2007-06-15 An improved method for identifying functionally linked proteins using phylogenetic profiles Cokus, Shawn Mizutani, Sayaka Pellegrini, Matteo BMC Bioinformatics Proceedings BioMed Central 2007-05-22 /pmc/articles/PMC1892086/ /pubmed/17570150 http://dx.doi.org/10.1186/1471-2105-8-S4-S7 Text en Copyright © 2007 Cokus et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title An improved method for identifying functionally linked proteins using phylogenetic profiles
topic Proceedings