Improving the Measurement of Semantic Similarity between Gene Ontology Terms and Gene Products: Insights from an Edge- and IC-Based Hybrid Method

BACKGROUND: Explicit comparisons based on the semantic similarity of Gene Ontology terms provide a quantitative way to measure the functional similarity between gene products and are widely applied in large-scale genomic research via integration with other models. Previously, we presented an edge-ba...

Bibliographic Details
Main authors: Wu, Xiaomei; Pang, Erli; Lin, Kui; Pei, Zhen-Ming
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3669204/
https://www.ncbi.nlm.nih.gov/pubmed/23741529
http://dx.doi.org/10.1371/journal.pone.0066745
author Wu, Xiaomei
Pang, Erli
Lin, Kui
Pei, Zhen-Ming
collection PubMed
description BACKGROUND: Explicit comparisons based on the semantic similarity of Gene Ontology terms provide a quantitative way to measure the functional similarity between gene products and are widely applied in large-scale genomic research via integration with other models. Previously, we presented an edge-based method, Relative Specificity Similarity (RSS), which takes the global position of relevant terms into account. However, edge-based semantic similarity metrics are sensitive to the intrinsic structure of GO and simply consider terms at the same level in the ontology to be equally specific nodes, revealing the weaknesses that could be complemented using information content (IC). RESULTS AND CONCLUSIONS: Here, we used the IC-based nodes to improve RSS and proposed a new method, Hybrid Relative Specificity Similarity (HRSS). HRSS outperformed other methods in distinguishing true protein-protein interactions from false. HRSS values were divided into four different levels of confidence for protein interactions. In addition, HRSS was statistically the best at obtaining the highest average functional similarity among human-mouse orthologs. Both HRSS and the groupwise measure, simGIC, are superior in correlation with sequence and Pfam similarities. Because different measures are best suited for different circumstances, we compared two pairwise strategies, the maximum and the best-match average, in the evaluation. The former was more effective at inferring physical protein-protein interactions, and the latter at estimating the functional conservation of orthologs and analyzing the CESSM datasets. In conclusion, HRSS can be applied to different biological problems by quantifying the functional similarity between gene products. The algorithm HRSS was implemented in the C programming language, which is freely available from http://cmb.bnu.edu.cn/hrss.
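The two pairwise strategies named in the description, the maximum and the best-match average, can be illustrated with a minimal Python sketch. It assumes the commonly used definitions: the maximum takes the single best term-pair score, and the best-match average takes the mean of the row-wise best matches and the mean of the column-wise best matches, then averages the two. The record does not give the paper's exact formulas, so these definitions, the function names, and the toy score matrix are assumptions for illustration only, and the scores are not HRSS values.

```python
from typing import Sequence

Matrix = Sequence[Sequence[float]]


def max_strategy(scores: Matrix) -> float:
    """Similarity of two gene products = best score over all term pairs."""
    if not scores or not scores[0]:
        raise ValueError("both gene products need at least one annotated term")
    return max(max(row) for row in scores)


def best_match_average(scores: Matrix) -> float:
    """Average of the mean row-wise best match and the mean column-wise best match.

    Assumed definition; some tools instead divide the summed best matches
    by the total number of terms, which can give a different value.
    """
    if not scores or not scores[0]:
        raise ValueError("both gene products need at least one annotated term")
    row_best = [max(row) for row in scores]        # best partner for each term of product A
    col_best = [max(col) for col in zip(*scores)]  # best partner for each term of product B
    return (sum(row_best) / len(row_best) + sum(col_best) / len(col_best)) / 2


# Hypothetical term-to-term scores: rows are terms of product A, columns are terms of product B.
toy_scores = [
    [1.0, 0.25, 0.5],
    [0.25, 0.75, 0.0],
]

if __name__ == "__main__":
    print("MAX:", max_strategy(toy_scores))
    print("BMA:", best_match_average(toy_scores))
```

In this toy matrix a single strongly matching term pair drives the maximum, while the best-match average is pulled down by terms that have no close partner, which is one way to read why the two strategies suit different tasks.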
format Online
Article
Text
id pubmed-3669204
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3669204 2013-06-05 PLoS One Research Article Public Library of Science 2013-05-31 /pmc/articles/PMC3669204/ /pubmed/23741529 http://dx.doi.org/10.1371/journal.pone.0066745 Text en © 2013 Wu et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Improving the Measurement of Semantic Similarity between Gene Ontology Terms and Gene Products: Insights from an Edge- and IC-Based Hybrid Method
topic Research Article