
Quantitative assessment of relationship between sequence similarity and function similarity

BACKGROUND: Comparative sequence analysis is considered as the first step towards annotating new proteins in genome annotation. However, sequence comparison may lead to creation and propagation of function assignment errors. Thus, it is important to perform a thorough analysis for the quality of seq...


Bibliographic Details
Main Authors: Joshi, Trupti; Xu, Dong
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1949826/
https://www.ncbi.nlm.nih.gov/pubmed/17620139
http://dx.doi.org/10.1186/1471-2164-8-222
_version_ 1782134522145406976
author Joshi, Trupti
Xu, Dong
author_facet Joshi, Trupti
Xu, Dong
author_sort Joshi, Trupti
collection PubMed
description BACKGROUND: Comparative sequence analysis is considered as the first step towards annotating new proteins in genome annotation. However, sequence comparison may lead to creation and propagation of function assignment errors. Thus, it is important to perform a thorough analysis for the quality of sequence-based function assignment using large-scale data in a systematic way. RESULTS: We present an analysis of the relationship between sequence similarity and function similarity for the proteins in four model organisms, i.e., Arabidopsis thaliana, Saccharomyces cerevisiae, Caenorhabditis elegans, and Drosophila melanogaster. Using a measure of functional similarity based on the three categories of Gene Ontology (GO) classifications (biological process, molecular function, and cellular component), we quantified the correlation between functional similarity and sequence similarity measured by sequence identity or statistical significance of the alignment and compared such a correlation against randomly chosen protein pairs. CONCLUSION: Various sequence-function relationships were identified from BLAST versus PSI-BLAST, sequence identity versus Expectation Value, GO indices versus semantic similarity approaches, and within genome versus between genome comparisons, for the three GO categories. Our study provides a benchmark to estimate the confidence in assignment of functions purely based on sequence similarity.
format Text
id pubmed-1949826
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1949826 2007-08-17 Quantitative assessment of relationship between sequence similarity and function similarity Joshi, Trupti Xu, Dong BMC Genomics Research Article BACKGROUND: Comparative sequence analysis is considered as the first step towards annotating new proteins in genome annotation. However, sequence comparison may lead to creation and propagation of function assignment errors. Thus, it is important to perform a thorough analysis for the quality of sequence-based function assignment using large-scale data in a systematic way. RESULTS: We present an analysis of the relationship between sequence similarity and function similarity for the proteins in four model organisms, i.e., Arabidopsis thaliana, Saccharomyces cerevisiae, Caenorhabditis elegans, and Drosophila melanogaster. Using a measure of functional similarity based on the three categories of Gene Ontology (GO) classifications (biological process, molecular function, and cellular component), we quantified the correlation between functional similarity and sequence similarity measured by sequence identity or statistical significance of the alignment and compared such a correlation against randomly chosen protein pairs. CONCLUSION: Various sequence-function relationships were identified from BLAST versus PSI-BLAST, sequence identity versus Expectation Value, GO indices versus semantic similarity approaches, and within genome versus between genome comparisons, for the three GO categories. Our study provides a benchmark to estimate the confidence in assignment of functions purely based on sequence similarity. BioMed Central 2007-07-09 /pmc/articles/PMC1949826/ /pubmed/17620139 http://dx.doi.org/10.1186/1471-2164-8-222 Text en Copyright © 2007 Joshi and Xu; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Joshi, Trupti
Xu, Dong
Quantitative assessment of relationship between sequence similarity and function similarity
title Quantitative assessment of relationship between sequence similarity and function similarity
title_full Quantitative assessment of relationship between sequence similarity and function similarity
title_fullStr Quantitative assessment of relationship between sequence similarity and function similarity
title_full_unstemmed Quantitative assessment of relationship between sequence similarity and function similarity
title_short Quantitative assessment of relationship between sequence similarity and function similarity
title_sort quantitative assessment of relationship between sequence similarity and function similarity
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1949826/
https://www.ncbi.nlm.nih.gov/pubmed/17620139
http://dx.doi.org/10.1186/1471-2164-8-222
work_keys_str_mv AT joshitrupti quantitativeassessmentofrelationshipbetweensequencesimilarityandfunctionsimilarity
AT xudong quantitativeassessmentofrelationshipbetweensequencesimilarityandfunctionsimilarity