
Comparison of eukaryotic phylogenetic profiling approaches using species tree aware methods

BACKGROUND: Phylogenetic profiling encompasses an important set of methodologies for in silico high throughput inference of functional relationships between genes. The simplest profiles represent the distribution of gene presence-absence in a set of species as a sequence of 0's and 1's, an...


Bibliographic Details
Main authors: Ruano-Rubio, Valentín, Poch, Olivier, Thompson, Julie D
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2787529/
https://www.ncbi.nlm.nih.gov/pubmed/19930674
http://dx.doi.org/10.1186/1471-2105-10-383
author Ruano-Rubio, Valentín
Poch, Olivier
Thompson, Julie D
author_facet Ruano-Rubio, Valentín
Poch, Olivier
Thompson, Julie D
author_sort Ruano-Rubio, Valentín
collection PubMed
description BACKGROUND: Phylogenetic profiling encompasses an important set of methodologies for in silico high throughput inference of functional relationships between genes. The simplest profiles represent the distribution of gene presence-absence in a set of species as a sequence of 0's and 1's, and it is assumed that functionally related genes will have more similar profiles. The methodology has been successfully used in numerous studies of prokaryotic genomes, although its application in eukaryotes appears problematic, with reported low accuracy due to the complex genomic organization within this domain of life. Recently some groups have proposed an alternative approach based on the correlation of homologous gene group sizes, taking into account all potentially informative genetic events leading to a change in group size, regardless of whether they result in a de novo group gain or total gene group loss. RESULTS: We have compared the performance of classical presence-absence and group size based approaches using a large, diverse set of eukaryotic species. In contrast to most previous comparisons in Eukarya, we take into account the species phylogeny. We also compare the approaches using two different group categories, based on orthology and on domain-sharing. Our results confirm a limited overall performance of phylogenetic profiling in eukaryotes. Although group size based approaches initially showed an increase in performance for the domain-sharing based groups, this seems to be an overestimation due to a simplistic negative control dataset and the choice of null hypothesis rejection criteria. CONCLUSION: Presence-absence profiling represents a more accurate classifier of related versus non-related profile pairs, when the profiles under consideration have enough information content. Group size based approaches provide a complementary means of detecting domain or family level co-evolution between groups that may be elusive to presence-absence profiling. 
Moreover, the positive correlation between co-evolution scores and functional links implies that these methods could be used to estimate functional distances between gene groups and to cluster them based on their functional relatedness. This study should have important implications for the future development and application of phylogenetic profiling methods, not only in eukaryotic, but also in prokaryotic datasets.
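The two profile types contrasted in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: it ignores the species phylogeny that the study explicitly accounts for, and it uses plain matching similarity and Pearson correlation as stand-in scores. The group sizes, variable names, and function names are all hypothetical and chosen only for illustration.

```python
from math import sqrt

def to_presence_absence(group_sizes):
    """Collapse per-species gene group sizes into a 0/1 presence-absence profile."""
    return [1 if n > 0 else 0 for n in group_sizes]

def matching_similarity(a, b):
    """Fraction of species in which two 0/1 profiles agree (1 - normalized Hamming distance)."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same species")
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)

def pearson(x, y):
    """Pearson correlation of two group-size profiles; 0.0 if either profile is constant."""
    if len(x) != len(y):
        raise ValueError("profiles must cover the same species")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

# Hypothetical gene group sizes in six species (assumed data, not from the study).
group_a = [2, 0, 1, 4, 0, 3]
group_b = [1, 0, 1, 2, 0, 2]
group_c = [0, 3, 0, 1, 2, 0]

pa_a = to_presence_absence(group_a)
pa_b = to_presence_absence(group_b)
pa_c = to_presence_absence(group_c)

print(matching_similarity(pa_a, pa_b))  # identical presence-absence patterns
print(matching_similarity(pa_a, pa_c))  # mostly complementary patterns
print(pearson(group_a, group_b))        # group sizes rise and fall together
print(pearson(group_a, group_c))        # group sizes vary in opposite directions
```

Because both scores treat species as independent observations, closely related species would inflate apparent similarity; the species tree aware comparison named in the title addresses that limitation, which this sketch does not attempt.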
format Text
id pubmed-2787529
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2787529 2009-12-03 Comparison of eukaryotic phylogenetic profiling approaches using species tree aware methods Ruano-Rubio, Valentín Poch, Olivier Thompson, Julie D BMC Bioinformatics Research article BioMed Central 2009-11-24 /pmc/articles/PMC2787529/ /pubmed/19930674 http://dx.doi.org/10.1186/1471-2105-10-383 Text en Copyright ©2009 Ruano-Rubio et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Comparison of eukaryotic phylogenetic profiling approaches using species tree aware methods
topic Research article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2787529/
https://www.ncbi.nlm.nih.gov/pubmed/19930674
http://dx.doi.org/10.1186/1471-2105-10-383