An improved method for identifying functionally linked proteins using phylogenetic profiles
BACKGROUND: Phylogenetic profiles record the occurrence of homologs of genes across fully sequenced organisms. Proteins with similar profiles are typically components of protein complexes or metabolic pathways. Various existing methods measure similarity between two profiles and, hence, the likeliho...
Main authors: | Cokus, Shawn; Mizutani, Sayaka; Pellegrini, Matteo |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2007 |
Subjects: | Proceedings |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1892086/ https://www.ncbi.nlm.nih.gov/pubmed/17570150 http://dx.doi.org/10.1186/1471-2105-8-S4-S7 |
_version_ | 1782133823925911552 |
---|---|
author | Cokus, Shawn Mizutani, Sayaka Pellegrini, Matteo |
author_facet | Cokus, Shawn Mizutani, Sayaka Pellegrini, Matteo |
author_sort | Cokus, Shawn |
collection | PubMed |
description | BACKGROUND: Phylogenetic profiles record the occurrence of homologs of genes across fully sequenced organisms. Proteins with similar profiles are typically components of protein complexes or metabolic pathways. Various existing methods measure similarity between two profiles and, hence, the likelihood that the two proteins co-evolve. Some methods ignore phylogenetic relationships between organisms, while others account for them with metrics that explicitly model the likelihood of two proteins co-evolving on a tree. The latter methods detect co-evolving proteins more sensitively, but at a significant computational cost. Here we propose a novel heuristic to improve phylogenetic profile analysis that accounts for phylogenetic relationships between genomes in a computationally efficient fashion. We first order the genomes within profiles, then enumerate runs of consecutive matches and accurately compute the probability of observing these. We hypothesize that profiles with many runs are more likely to involve functionally related proteins than profiles in which all the matches are concentrated in one interval of the tree. RESULTS: We compared our approach to various previously published methods that either ignore or incorporate the underlying phylogeny between organisms. To evaluate performance, we compare the functional similarity of rank-ordered lists of protein pairs that share similar phylogenetic profiles by assessing significance of overlap in their Gene Ontology annotations. Accounting for runs in phylogenetic profile matches improves our ability to identify functionally related pairs of proteins. Furthermore, the networks that result from our approach tend to have smaller clusters of co-evolving proteins than networks computed using previous approaches and are thus more useful for inferring functional relationships. Finally, we report that our approach is orders of magnitude more computationally efficient than full tree-based methods. CONCLUSION: We have developed an improved method for analyzing phylogenetic profiles. The method allows us to more accurately and efficiently infer functional relationships between proteins based on these profiles than other published approaches. As the number of fully sequenced genomes increases, it becomes more important to account for evolutionary relationships among organisms in comparative analyses. Our approach, therefore, serves as an important example of how these relationships may be accounted for in an efficient manner. |
format | Text |
id | pubmed-1892086 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2007 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-1892086 2007-06-15 An improved method for identifying functionally linked proteins using phylogenetic profiles Cokus, Shawn Mizutani, Sayaka Pellegrini, Matteo BMC Bioinformatics Proceedings BioMed Central 2007-05-22 /pmc/articles/PMC1892086/ /pubmed/17570150 http://dx.doi.org/10.1186/1471-2105-8-S4-S7 Text en Copyright © 2007 Cokus et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Proceedings Cokus, Shawn Mizutani, Sayaka Pellegrini, Matteo An improved method for identifying functionally linked proteins using phylogenetic profiles |
title | An improved method for identifying functionally linked proteins using phylogenetic profiles |
title_full | An improved method for identifying functionally linked proteins using phylogenetic profiles |
title_fullStr | An improved method for identifying functionally linked proteins using phylogenetic profiles |
title_full_unstemmed | An improved method for identifying functionally linked proteins using phylogenetic profiles |
title_short | An improved method for identifying functionally linked proteins using phylogenetic profiles |
title_sort | improved method for identifying functionally linked proteins using phylogenetic profiles |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1892086/ https://www.ncbi.nlm.nih.gov/pubmed/17570150 http://dx.doi.org/10.1186/1471-2105-8-S4-S7 |
work_keys_str_mv | AT cokusshawn animprovedmethodforidentifyingfunctionallylinkedproteinsusingphylogeneticprofiles AT mizutanisayaka animprovedmethodforidentifyingfunctionallylinkedproteinsusingphylogeneticprofiles AT pellegrinimatteo animprovedmethodforidentifyingfunctionallylinkedproteinsusingphylogeneticprofiles AT cokusshawn improvedmethodforidentifyingfunctionallylinkedproteinsusingphylogeneticprofiles AT mizutanisayaka improvedmethodforidentifyingfunctionallylinkedproteinsusingphylogeneticprofiles AT pellegrinimatteo improvedmethodforidentifyingfunctionallylinkedproteinsusingphylogeneticprofiles |
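
The abstract describes ordering genomes within profiles and enumerating runs of consecutive matches. The Python sketch below illustrates only that counting step, and it is not the authors' implementation. It assumes profiles are 0/1 lists already arranged in a phylogenetically meaningful genome order, and that a "match" means both proteins have a homolog in the same genome. The function names `count_matches_and_runs` and `hypergeom_tail` are hypothetical. The hypergeometric tail is included as a familiar baseline that ignores genome order; the record does not give the paper's own probability for runs, so it is not reproduced here.

```python
from math import comb


def count_matches_and_runs(profile_a, profile_b):
    """Count shared presences and runs of consecutive shared presences.

    Assumption: both profiles are 0/1 sequences over the same genomes,
    listed in the same (phylogenetically ordered) sequence.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genomes")
    matches = 0
    runs = 0
    in_run = False
    for a, b in zip(profile_a, profile_b):
        if a == 1 and b == 1:
            matches += 1
            if not in_run:  # a new run starts at this genome
                runs += 1
            in_run = True
        else:
            in_run = False
    return matches, runs


def hypergeom_tail(n_genomes, n_a, n_b, matches):
    """P(X >= matches) for X ~ Hypergeometric(n_genomes, n_a, n_b).

    Order-blind baseline: the chance of at least `matches` shared
    genomes when n_a and n_b presences are placed at random.
    """
    total = comb(n_genomes, n_b)
    upper = min(n_a, n_b)
    favourable = sum(
        comb(n_a, k) * comb(n_genomes - n_a, n_b - k)
        for k in range(matches, upper + 1)
    )
    return favourable / total


# Two toy pairs with the same number of matches but different run structure.
clustered = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scattered = [1, 0, 1, 0, 0, 1, 0, 0, 1, 0]

clustered_result = count_matches_and_runs(clustered, clustered)
scattered_result = count_matches_and_runs(scattered, scattered)
baseline_p = hypergeom_tail(10, 4, 4, 4)
```

Both toy pairs share four genomes, so the order-blind baseline scores them identically, while the run counts differ (one run versus four). This is the distinction the abstract's hypothesis relies on: matches spread over many runs are taken as stronger evidence of a functional link than matches concentrated in one interval of the tree.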