
Gene annotation and network inference by phylogenetic profiling

BACKGROUND: Phylogenetic analysis is emerging as one of the most informative computational methods for the annotation of genes and identification of evolutionary modules of functionally related genes. The effectiveness with which phylogenetic profiles can be utilized to assign genes to pathways depe...


Bibliographic Details
Main authors: Wu, Jie; Hu, Zhenjun; DeLisi, Charles
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1388238/
https://www.ncbi.nlm.nih.gov/pubmed/16503966
http://dx.doi.org/10.1186/1471-2105-7-80
author Wu, Jie
Hu, Zhenjun
DeLisi, Charles
collection PubMed
description BACKGROUND: Phylogenetic analysis is emerging as one of the most informative computational methods for the annotation of genes and identification of evolutionary modules of functionally related genes. The effectiveness with which phylogenetic profiles can be utilized to assign genes to pathways depends on an appropriate measure of correlation between gene profiles, and an effective decision rule to use the correlate. Current methods, though useful, perform at a level well below what is possible, largely because performance of the latter deteriorates rapidly as coverage increases. RESULTS: We introduce, test and apply a new decision rule, correlation enrichment (CE), for assigning genes to functional categories at various levels of resolution. Among the results are: (1) CE performs better than standard guilt by association (SGA, assignment to a functional category when a simple correlate exceeds a pre-specified threshold) irrespective of the number of genes assigned (i.e. coverage); improvement is greatest at high coverage, where precision (positive predictive value) of CE is approximately 6-fold higher than that of SGA. (2) CE is estimated to allocate each of the 2918 unannotated orthologs to KEGG pathways with an average precision of 49% (approximately 7-fold higher than SGA). (3) An estimated 94% of the 1846 unannotated orthologs in the COG ontology can be assigned a function with an average precision of 0.4 or greater. (4) Dozens of functional and evolutionarily conserved cliques or quasi-cliques can be identified, many having previously unannotated genes. CONCLUSION: The method serves as a general computational tool for annotating large numbers of unknown genes, uncovering evolutionary and functional modules. It appears to perform substantially better than extant stand-alone high-throughput methods.
format Text
id pubmed-1388238
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1388238 2006-04-21 Gene annotation and network inference by phylogenetic profiling Wu, Jie Hu, Zhenjun DeLisi, Charles BMC Bioinformatics Methodology Article BioMed Central 2006-02-17 /pmc/articles/PMC1388238/ /pubmed/16503966 http://dx.doi.org/10.1186/1471-2105-7-80 Text en Copyright © 2006 Wu et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Gene annotation and network inference by phylogenetic profiling
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1388238/
https://www.ncbi.nlm.nih.gov/pubmed/16503966
http://dx.doi.org/10.1186/1471-2105-7-80
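
The standard guilt-by-association (SGA) rule and the precision measure named in the abstract can be illustrated with a minimal Python sketch. It assumes binary presence/absence profiles across genomes and uses Pearson correlation as the "simple correlate"; this record does not state which correlation measure the authors use, and the correlation enrichment (CE) rule is not specified here, so it is not implemented. All gene names, profiles, category labels, and the 0.6 threshold are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant profile has no defined correlation; treat it as 0.
        return 0.0
    return cov / sqrt(var_x * var_y)


def sga_assign(query_profile, annotated, threshold):
    """SGA as described in the abstract: assign the query gene to every
    category that has an annotated gene whose profile correlates with
    the query above a pre-specified threshold."""
    categories = set()
    for profile, category in annotated.values():
        if pearson(query_profile, profile) > threshold:
            categories.add(category)
    return categories


def precision(predictions, truth):
    """Positive predictive value: correct assignments / all assignments."""
    if not predictions:
        return 0.0
    correct = sum(1 for gene, cat in predictions if truth.get(gene) == cat)
    return correct / len(predictions)


# Hypothetical toy data: presence (1) or absence (0) in six genomes.
annotated = {
    "geneA": ([1, 1, 0, 1, 0, 0], "pathway_1"),
    "geneB": ([1, 1, 0, 1, 0, 1], "pathway_1"),
    "geneC": ([0, 0, 1, 0, 1, 1], "pathway_2"),
}
query = [1, 1, 0, 1, 0, 0]

assigned = sga_assign(query, annotated, threshold=0.6)
print(sorted(assigned))

# Hypothetical predictions scored against hypothetical known labels.
predictions = [("g1", "pathway_1"), ("g2", "pathway_1"), ("g3", "pathway_2")]
truth = {"g1": "pathway_1", "g2": "pathway_2", "g3": "pathway_2"}
print(precision(predictions, truth))
```

Lowering the threshold assigns more genes (higher coverage) but admits weaker correlations, which is the trade-off the abstract describes when it says the performance of the decision rule deteriorates as coverage increases.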