
GS(2): an efficiently computable measure of GO-based similarity of gene sets

Motivation: The growing availability of genome-scale datasets has attracted increasing attention to the development of computational methods for automated inference of functional similarities among genes and their products. One class of such methods measures the functional similarity of genes based...


Bibliographic Details
Main Authors:	Ruths, Troy; Ruths, Derek; Nakhleh, Luay
Format:	Text
Language:	English
Published:	Oxford University Press 2009
Subjects:	Original Papers
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2672633/ https://www.ncbi.nlm.nih.gov/pubmed/19289444 http://dx.doi.org/10.1093/bioinformatics/btp128

author	Ruths, Troy; Ruths, Derek; Nakhleh, Luay
collection	PubMed
description	Motivation: The growing availability of genome-scale datasets has attracted increasing attention to the development of computational methods for automated inference of functional similarities among genes and their products. One class of such methods measures the functional similarity of genes based on their distance in the Gene Ontology (GO). To measure the functional relatedness of a gene set, these measures consider every pair of genes in the set, and the average of all pairwise distances is calculated. However, as more data becomes available and gene sets used for analysis become larger, such pair-based calculation becomes prohibitive. Results: In this article, we propose GS(2) (GO-based similarity of gene sets), a novel GO-based measure of gene set similarity that is computable in linear time in the size of the gene set. The measure quantifies the similarity of the GO annotations among a set of genes by averaging the contribution of each gene's GO terms and their ancestor terms with respect to the GO vocabulary graph. To study the performance of our method, we compared our measure with an established pair-based measure when run on gene sets with varying degrees of functional similarities. In addition to a significant speed improvement, our method produced comparable similarity scores to the established method. Our method is available as a web-based tool and an open-source Python library. Availability: The web-based tools and Python code are available at: http://bioserver.cs.rice.edu/gs2. Contact: troy.ruths@rice.edu
format	Text
id	pubmed-2672633
institution	National Center for Biotechnology Information
language	English
publishDate	2009
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-2672633 2009-04-29 GS(2): an efficiently computable measure of GO-based similarity of gene sets Ruths, Troy; Ruths, Derek; Nakhleh, Luay Bioinformatics Original Papers Oxford University Press 2009-05-01 2009-03-16 /pmc/articles/PMC2672633/ /pubmed/19289444 http://dx.doi.org/10.1093/bioinformatics/btp128 Text en © 2009 The Author(s) http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	GS(2): an efficiently computable measure of GO-based similarity of gene sets
topic	Original Papers
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2672633/ https://www.ncbi.nlm.nih.gov/pubmed/19289444 http://dx.doi.org/10.1093/bioinformatics/btp128
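
The description above contrasts pair-based averaging, which is quadratic in the number of genes, with a measure that is linear in the size of the gene set. The following Python sketch illustrates one way such a linear-time score could be computed: count once how many genes carry each GO term (including ancestor terms), then average each gene's share of overlap. It is an illustration under stated assumptions, not the authors' code and not necessarily the exact GS(2) formula. The names `ancestor_closure` and `set_similarity`, the normalization by the other genes in the set, and the toy `parents` map are all hypothetical.

```python
from collections import Counter


def ancestor_closure(terms, parents):
    """Return the given GO terms together with all of their ancestors.

    `parents` maps a term to its direct parent terms (hypothetical toy format).
    """
    seen = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.add(term)
            stack.extend(parents.get(term, ()))
    return seen


def set_similarity(annotations, parents):
    """Score how similar the GO annotations of a gene set are, in [0, 1].

    Assumed scoring (not taken from the paper): for each gene, average over
    its terms the fraction of the *other* genes that share the term, then
    average over all genes. One counting pass replaces all pairwise
    comparisons, so the work grows linearly with the number of genes for
    bounded annotation sizes.
    """
    closures = {g: ancestor_closure(ts, parents) for g, ts in annotations.items()}
    n = len(closures)
    if n < 2:
        raise ValueError("need at least two genes")

    # How many genes carry each term (directly or through a descendant).
    counts = Counter(t for closure in closures.values() for t in closure)

    total = 0.0
    for closure in closures.values():
        if not closure:
            continue  # an unannotated gene contributes zero
        total += sum((counts[t] - 1) / (n - 1) for t in closure) / len(closure)
    return total / n


# Toy ontology: root has children a and b; a has child a1.
toy_parents = {"a": ["root"], "b": ["root"], "a1": ["a"]}
same = set_similarity({"g1": {"a1"}, "g2": {"a1"}}, toy_parents)
different = set_similarity({"g1": {"a"}, "g2": {"b"}}, toy_parents)
print(same, different)
```

In this toy example, two genes with identical annotations score 1.0, while two genes that share only the root term score 0.5. A real analysis would take the term hierarchy and gene annotations from GO release files rather than a hand-written dictionary.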
