Comparative assessment of performance and genome dependence among phylogenetic profiling methods
BACKGROUND: The rapidly increasing speed with which genome sequence data can be generated will be accompanied by an exponential increase in the number of sequenced eukaryotes. With the increasing number of sequenced eukaryotic genomes comes a need for bioinformatic techniques to aid in functional an...
Main authors: | Snitkin, Evan S; Gustafson, Adam M; Mellor, Joseph; Wu, Jie; DeLisi, Charles |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2006 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1592128/ https://www.ncbi.nlm.nih.gov/pubmed/17005048 http://dx.doi.org/10.1186/1471-2105-7-420 |
_version_ | 1782130382177566720 |
---|---|
author | Snitkin, Evan S; Gustafson, Adam M; Mellor, Joseph; Wu, Jie; DeLisi, Charles |
author_facet | Snitkin, Evan S; Gustafson, Adam M; Mellor, Joseph; Wu, Jie; DeLisi, Charles |
author_sort | Snitkin, Evan S |
collection | PubMed |
description | BACKGROUND: The rapidly increasing speed with which genome sequence data can be generated will be accompanied by an exponential increase in the number of sequenced eukaryotes. With the increasing number of sequenced eukaryotic genomes comes a need for bioinformatic techniques to aid in functional annotation. Ideally, genome context based techniques such as proximity, fusion, and phylogenetic profiling, which have been so successful in prokaryotes, could be utilized in eukaryotes. Here we explore the application of phylogenetic profiling, a method that exploits the evolutionary co-occurrence of genes in the assignment of functional linkages, to eukaryotic genomes. RESULTS: In order to evaluate the performance of phylogenetic profiling in eukaryotes, we assessed the relative performance of commonly used profile construction techniques and genome compositions in predicting functional linkages in both prokaryotic and eukaryotic organisms. When predicting linkages in E. coli with a prokaryotic profile, the use of continuous values constructed from transformed BLAST bit-scores performed better than profiles composed of discretized E-values; the use of discretized E-values resulted in more accurate linkages when using S. cerevisiae as the query organism. Extending this analysis by incorporating several eukaryotic genomes in profiles containing a majority of prokaryotes resulted in similar overall accuracy, but with a surprising reduction in pathway diversity among the most significant linkages. Furthermore, the application of phylogenetic profiling using profiles composed of only eukaryotes resulted in the loss of the strong correlation between common KEGG pathway membership and profile similarity score. Profile construction methods, orthology definitions, ontology and domain complexity were explored as possible sources of the poor performance of eukaryotic profiles, but with no improvement in results. CONCLUSION: Given the current set of completely sequenced eukaryotic organisms, phylogenetic profiling using profiles generated from any of the commonly used techniques was found to yield extremely poor results. These findings imply genome-specific requirements for constructing functionally relevant phylogenetic profiles, and suggest that differences in the evolutionary history between different kingdoms might generally limit the usefulness of phylogenetic profiling in eukaryotes. |
format | Text |
id | pubmed-1592128 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2006 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-1592128 2006-10-05 Comparative assessment of performance and genome dependence among phylogenetic profiling methods Snitkin, Evan S; Gustafson, Adam M; Mellor, Joseph; Wu, Jie; DeLisi, Charles BMC Bioinformatics Research Article BACKGROUND: The rapidly increasing speed with which genome sequence data can be generated will be accompanied by an exponential increase in the number of sequenced eukaryotes. With the increasing number of sequenced eukaryotic genomes comes a need for bioinformatic techniques to aid in functional annotation. Ideally, genome context based techniques such as proximity, fusion, and phylogenetic profiling, which have been so successful in prokaryotes, could be utilized in eukaryotes. Here we explore the application of phylogenetic profiling, a method that exploits the evolutionary co-occurrence of genes in the assignment of functional linkages, to eukaryotic genomes. RESULTS: In order to evaluate the performance of phylogenetic profiling in eukaryotes, we assessed the relative performance of commonly used profile construction techniques and genome compositions in predicting functional linkages in both prokaryotic and eukaryotic organisms. When predicting linkages in E. coli with a prokaryotic profile, the use of continuous values constructed from transformed BLAST bit-scores performed better than profiles composed of discretized E-values; the use of discretized E-values resulted in more accurate linkages when using S. cerevisiae as the query organism. Extending this analysis by incorporating several eukaryotic genomes in profiles containing a majority of prokaryotes resulted in similar overall accuracy, but with a surprising reduction in pathway diversity among the most significant linkages. Furthermore, the application of phylogenetic profiling using profiles composed of only eukaryotes resulted in the loss of the strong correlation between common KEGG pathway membership and profile similarity score. Profile construction methods, orthology definitions, ontology and domain complexity were explored as possible sources of the poor performance of eukaryotic profiles, but with no improvement in results. CONCLUSION: Given the current set of completely sequenced eukaryotic organisms, phylogenetic profiling using profiles generated from any of the commonly used techniques was found to yield extremely poor results. These findings imply genome-specific requirements for constructing functionally relevant phylogenetic profiles, and suggest that differences in the evolutionary history between different kingdoms might generally limit the usefulness of phylogenetic profiling in eukaryotes. BioMed Central 2006-09-27 /pmc/articles/PMC1592128/ /pubmed/17005048 http://dx.doi.org/10.1186/1471-2105-7-420 Text en Copyright © 2006 Snitkin et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Snitkin, Evan S Gustafson, Adam M Mellor, Joseph Wu, Jie DeLisi, Charles Comparative assessment of performance and genome dependence among phylogenetic profiling methods |
title | Comparative assessment of performance and genome dependence among phylogenetic profiling methods |
title_full | Comparative assessment of performance and genome dependence among phylogenetic profiling methods |
title_fullStr | Comparative assessment of performance and genome dependence among phylogenetic profiling methods |
title_full_unstemmed | Comparative assessment of performance and genome dependence among phylogenetic profiling methods |
title_short | Comparative assessment of performance and genome dependence among phylogenetic profiling methods |
title_sort | comparative assessment of performance and genome dependence among phylogenetic profiling methods |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1592128/ https://www.ncbi.nlm.nih.gov/pubmed/17005048 http://dx.doi.org/10.1186/1471-2105-7-420 |
work_keys_str_mv | AT snitkinevans comparativeassessmentofperformanceandgenomedependenceamongphylogeneticprofilingmethods AT gustafsonadamm comparativeassessmentofperformanceandgenomedependenceamongphylogeneticprofilingmethods AT mellorjoseph comparativeassessmentofperformanceandgenomedependenceamongphylogeneticprofilingmethods AT wujie comparativeassessmentofperformanceandgenomedependenceamongphylogeneticprofilingmethods AT delisicharles comparativeassessmentofperformanceandgenomedependenceamongphylogeneticprofilingmethods |
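
Illustrative note: the abstract in this record contrasts profiles built from discretized E-values with profiles built from continuous transformed bit-scores. The sketch below shows only the discretized variant, as a rough aid to reading the record. The E-value cutoff (1e-5), the use of Jaccard similarity, the gene names, and the E-values themselves are all assumptions made for illustration; the record does not state which threshold or similarity score the authors used.

```python
def discretize(evalues, threshold=1e-5):
    """Turn best-hit E-values (one per reference genome) into a 0/1 profile.

    None means no hit was found in that genome. The threshold is an
    assumed cutoff, not a value taken from the article.
    """
    return [1 if e is not None and e <= threshold else 0 for e in evalues]


def jaccard(p, q):
    """Similarity of two binary profiles: shared presences / total presences."""
    both = sum(1 for a, b in zip(p, q) if a and b)
    either = sum(1 for a, b in zip(p, q) if a or b)
    return both / either if either else 0.0


# Hypothetical best-hit E-values for three query genes across five genomes.
evalues = {
    "geneA": [1e-40, 1e-12, None, 2.0, 1e-30],
    "geneB": [1e-35, 1e-9, None, 0.5, 1e-22],
    "geneC": [None, 3.0, 1e-50, 1e-8, None],
}

profiles = {gene: discretize(vals) for gene, vals in evalues.items()}

# Genes with similar presence/absence patterns are candidate functional links.
for g1, g2 in [("geneA", "geneB"), ("geneA", "geneC")]:
    print(g1, g2, jaccard(profiles[g1], profiles[g2]))
```

In this toy data, geneA and geneB occur in the same genomes and so score as a candidate linkage, while geneA and geneC never co-occur. A continuous variant would keep a transformed score per genome instead of a 0/1 value and compare profiles with a correlation-style measure.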