Gene Ontology term overlap as a measure of gene functional similarity

BACKGROUND: The availability of various high-throughput experimental and computational methods allows biologists to rapidly infer functional relationships between genes. It is often necessary to evaluate these predictions computationally, a task that requires a reference database for functional rela...

Bibliographic Details
Main authors: Mistry, Meeta; Pavlidis, Paul
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2518162/
https://www.ncbi.nlm.nih.gov/pubmed/18680592
http://dx.doi.org/10.1186/1471-2105-9-327
_version_ 1782158547961774080
author Mistry, Meeta
Pavlidis, Paul
collection PubMed
description BACKGROUND: The availability of various high-throughput experimental and computational methods allows biologists to rapidly infer functional relationships between genes. It is often necessary to evaluate these predictions computationally, a task that requires a reference database for functional relatedness. One such reference is the Gene Ontology (GO). A number of groups have suggested that the semantic similarity of the GO annotations of genes can serve as a proxy for functional relatedness. Here we evaluate a simple measure of semantic similarity, term overlap (TO). RESULTS: We computed the TO for randomly selected gene pairs from the mouse genome. For comparison, we implemented six previously reported semantic similarity measures that share the feature of using computation of probabilities of terms to infer information content, in addition to three vector based approaches and a normalized version of the TO measure. We find that the overlap measure is highly correlated with the others but differs in detail. TO is at least as good a predictor of sequence similarity as the other measures. We further show that term overlap may avoid some problems that affect the probability-based measures. Term overlap is also much faster to compute than the information content-based measures. CONCLUSION: Our experiments suggest that term overlap can serve as a simple and fast alternative to other approaches which use explicit information content estimation or require complex pre-calculations, while also avoiding problems that some other measures may encounter.
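The description names term overlap (TO) and a normalized variant but does not spell out either computation. The following is a minimal sketch, not the authors' implementation. It assumes that TO is the number of GO terms two genes share once each annotation set has been expanded with all ancestor terms, and that the normalized version divides that count by the size of the smaller expanded set. The toy ontology, the term identifiers, and the function names are hypothetical and exist only for illustration.

```python
from typing import Dict, Iterable, Set

# Hypothetical toy ontology (not real GO identifiers):
# each term maps to the set of its direct parents.
PARENTS: Dict[str, Set[str]] = {
    "T:root": set(),
    "T:b": {"T:root"},
    "T:c": {"T:b"},
    "T:d": {"T:root"},
}


def with_ancestors(terms: Iterable[str], parents: Dict[str, Set[str]]) -> Set[str]:
    """Return the given terms plus every ancestor reachable via `parents`."""
    closed: Set[str] = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in closed:
            closed.add(term)
            stack.extend(parents.get(term, ()))
    return closed


def term_overlap(a: Iterable[str], b: Iterable[str],
                 parents: Dict[str, Set[str]]) -> int:
    """Assumed TO: count of terms shared by the two expanded annotation sets."""
    return len(with_ancestors(a, parents) & with_ancestors(b, parents))


def normalized_term_overlap(a: Iterable[str], b: Iterable[str],
                            parents: Dict[str, Set[str]]) -> float:
    """Assumed normalization: TO divided by the smaller expanded set size."""
    closed_a = with_ancestors(a, parents)
    closed_b = with_ancestors(b, parents)
    smaller = min(len(closed_a), len(closed_b))
    if smaller == 0:
        return 0.0  # a gene with no annotations shares nothing
    return len(closed_a & closed_b) / smaller


if __name__ == "__main__":
    gene1 = {"T:c"}  # expands to {T:c, T:b, T:root}
    gene2 = {"T:d"}  # expands to {T:d, T:root}
    print(term_overlap(gene1, gene2, PARENTS))
    print(normalized_term_overlap(gene1, gene2, PARENTS))
```

Under these assumptions the whole computation is a set intersection, with no term probabilities to estimate from an annotation corpus, which is consistent with the description's remark that TO is much faster to compute than the information content-based measures.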
format Text
id pubmed-2518162
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2518162 2008-08-20 Gene Ontology term overlap as a measure of gene functional similarity Mistry, Meeta Pavlidis, Paul BMC Bioinformatics Research Article BioMed Central 2008-08-04 /pmc/articles/PMC2518162/ /pubmed/18680592 http://dx.doi.org/10.1186/1471-2105-9-327 Text en Copyright © 2008 Mistry and Pavlidis; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Gene Ontology term overlap as a measure of gene functional similarity
topic Research Article