
The relationship between protein sequences and their gene ontology functions



Bibliographic Details
Main authors: Duan, Zhong-Hui, Hughes, Brent, Reichel, Lothar, Perez, Dianne M, Shi, Ting
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1780109/
https://www.ncbi.nlm.nih.gov/pubmed/17217503
http://dx.doi.org/10.1186/1471-2105-7-S4-S11
author Duan, Zhong-Hui
Hughes, Brent
Reichel, Lothar
Perez, Dianne M
Shi, Ting
collection PubMed
description BACKGROUND: One main research challenge in the post-genomic era is to understand the relationship between protein sequences and their biological functions. In recent years, several automated annotation systems have been developed for the functional assignment of uncharacterized proteins. The underlying assumption of these systems is that similar sequences imply similar biological functions. However, it has been noted that matching sequences do not always imply similar functions. RESULTS: In this paper, we present the correlation between protein sequences and protein functions for the yeast proteome in the context of gene ontology. A novel measure is introduced to define the overall similarity between two protein sequences. The effects of the level as well as the size of a gene ontology group on the degree of similarity were studied. The similarity distributions at different levels of gene ontology trees are presented. To evaluate the theoretical prediction power of similar sequences, we computed the posterior probability of correct predictions. CONCLUSION: The results indicate that protein pairs of similar biological functions tend to have higher sequence similarity, although the similarity distribution in each functional group is heterogeneous and varies from group to group. We conclude that sequence similarity can serve as a key measure in protein function prediction. However, the resulting annotations must be verified through other means. A method that combines a broader range of measures is more likely to provide more accurate prediction. Our study indicates that the posterior probability of a correct prediction could serve as one of the key measures.
format Text
id pubmed-1780109
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1780109 2007-01-24 The relationship between protein sequences and their gene ontology functions Duan, Zhong-Hui Hughes, Brent Reichel, Lothar Perez, Dianne M Shi, Ting BMC Bioinformatics Research
BioMed Central 2006-12-12 /pmc/articles/PMC1780109/ /pubmed/17217503 http://dx.doi.org/10.1186/1471-2105-7-S4-S11 Text en Copyright © 2006 Duan et al; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title The relationship between protein sequences and their gene ontology functions
topic Research