Comparative assessment of performance and genome dependence among phylogenetic profiling methods

BACKGROUND: The rapidly increasing speed with which genome sequence data can be generated will be accompanied by an exponential increase in the number of sequenced eukaryotes. With the increasing number of sequenced eukaryotic genomes comes a need for bioinformatic techniques to aid in functional an...

Bibliographic Details
Main authors: Snitkin, Evan S; Gustafson, Adam M; Mellor, Joseph; Wu, Jie; DeLisi, Charles
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1592128/
https://www.ncbi.nlm.nih.gov/pubmed/17005048
http://dx.doi.org/10.1186/1471-2105-7-420
_version_ 1782130382177566720
author Snitkin, Evan S
Gustafson, Adam M
Mellor, Joseph
Wu, Jie
DeLisi, Charles
author_sort Snitkin, Evan S
collection PubMed
description BACKGROUND: The rapidly increasing speed with which genome sequence data can be generated will be accompanied by an exponential increase in the number of sequenced eukaryotes. With the increasing number of sequenced eukaryotic genomes comes a need for bioinformatic techniques to aid in functional annotation. Ideally, genome context based techniques such as proximity, fusion, and phylogenetic profiling, which have been so successful in prokaryotes, could be utilized in eukaryotes. Here we explore the application of phylogenetic profiling, a method that exploits the evolutionary co-occurrence of genes in the assignment of functional linkages, to eukaryotic genomes. RESULTS: In order to evaluate the performance of phylogenetic profiling in eukaryotes, we assessed the relative performance of commonly used profile construction techniques and genome compositions in predicting functional linkages in both prokaryotic and eukaryotic organisms. When predicting linkages in E. coli with a prokaryotic profile, the use of continuous values constructed from transformed BLAST bit-scores performed better than profiles composed of discretized E-values; the use of discretized E-values resulted in more accurate linkages when using S. cerevisiae as the query organism. Extending this analysis by incorporating several eukaryotic genomes in profiles containing a majority of prokaryotes resulted in similar overall accuracy, but with a surprising reduction in pathway diversity among the most significant linkages. Furthermore, the application of phylogenetic profiling using profiles composed of only eukaryotes resulted in the loss of the strong correlation between common KEGG pathway membership and profile similarity score. Profile construction methods, orthology definitions, ontology and domain complexity were explored as possible sources of the poor performance of eukaryotic profiles, but with no improvement in results. 
CONCLUSION: Given the current set of completely sequenced eukaryotic organisms, phylogenetic profiling using profiles generated from any of the commonly used techniques was found to yield extremely poor results. These findings imply genome-specific requirements for constructing functionally relevant phylogenetic profiles, and suggest that differences in the evolutionary history between different kingdoms might generally limit the usefulness of phylogenetic profiling in eukaryotes.
format Text
id pubmed-1592128
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1592128 2006-10-05 Comparative assessment of performance and genome dependence among phylogenetic profiling methods Snitkin, Evan S Gustafson, Adam M Mellor, Joseph Wu, Jie DeLisi, Charles BMC Bioinformatics Research Article BioMed Central 2006-09-27 /pmc/articles/PMC1592128/ /pubmed/17005048 http://dx.doi.org/10.1186/1471-2105-7-420 Text en Copyright © 2006 Snitkin et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Comparative assessment of performance and genome dependence among phylogenetic profiling methods
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1592128/
https://www.ncbi.nlm.nih.gov/pubmed/17005048
http://dx.doi.org/10.1186/1471-2105-7-420