
Comparative analysis of gene ontology-based semantic similarity measurements for the application of identifying essential proteins

Identifying key proteins from protein-protein interaction (PPI) networks is one of the most fundamental and important tasks for computational biologists. However, the protein interactions obtained by high-throughput technology are characterized by a high false positive rate, which severely hinders t...


Bibliographic Details
Main authors: Xue, Xiaoli, Zhang, Wei, Fan, Anjing
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10121005/
https://www.ncbi.nlm.nih.gov/pubmed/37083829
http://dx.doi.org/10.1371/journal.pone.0284274
author Xue, Xiaoli
Zhang, Wei
Fan, Anjing
collection PubMed
description Identifying key proteins from protein-protein interaction (PPI) networks is a fundamental task in computational biology. However, protein interactions obtained by high-throughput technology have a high false positive rate, which severely limits the prediction accuracy of current computational methods. In this paper, we propose a novel strategy for identifying key proteins by constructing reliable PPI networks. Five Gene Ontology (GO)-based semantic similarity measurements (Jiang, Lin, Rel, Resnik, and Wang) are used to calculate confidence scores for protein pairs under each of the three GO annotation terms: molecular function (MF), biological process (BP), and cellular component (CC). Protein pairs with low similarity values are treated as low-confidence links, and refined PPI networks are constructed by filtering out these links. Six topology-based centrality methods (BC, DC, EC, NC, SC, and aveNC) are then applied to both the original and the refined networks to evaluate the measurements. We systematically compare the five semantic similarity metrics under the three GO annotation terms on four benchmark datasets, and the simulation results show that the centrality methods perform better on the refined PPI networks than on the original networks. Resnik with the BP annotation term performs best among all combinations of the five metrics and three annotation terms. These findings suggest the importance of semantic similarity metrics in measuring the reliability of links between proteins and highlight the Resnik metric with the BP annotation term as a favourable choice.
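The refinement step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the similarity scores, the threshold value, and the function names `refine_network`, `degree_centrality`, and `top_k` are all assumptions made for the example, and only degree centrality (which DC is taken to denote here) is shown out of the six centrality methods.

```python
from collections import defaultdict


def refine_network(edges, similarity, threshold):
    """Keep only edges whose similarity score is at least `threshold`.

    `similarity` maps frozenset({u, v}) to a precomputed GO-based score.
    Pairs without a score are treated as 0.0 (an assumption of this sketch).
    """
    return [
        (u, v)
        for u, v in edges
        if similarity.get(frozenset((u, v)), 0.0) >= threshold
    ]


def degree_centrality(edges):
    """Return the number of distinct neighbours of each node."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    return {node: len(nbrs) for node, nbrs in neighbors.items()}


def top_k(scores, k):
    """Rank nodes by descending score, breaking ties by node name."""
    return sorted(scores, key=lambda n: (-scores[n], n))[:k]


# Toy network with hypothetical similarity scores (not data from the paper).
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
similarity = {
    frozenset(("A", "B")): 0.9,
    frozenset(("A", "C")): 0.8,
    frozenset(("A", "D")): 0.1,  # low-confidence link, removed below
    frozenset(("B", "C")): 0.7,
    frozenset(("D", "E")): 0.6,
}

refined = refine_network(edges, similarity, threshold=0.5)
original_dc = degree_centrality(edges)
refined_dc = degree_centrality(refined)
```

In this toy case, dropping the low-scoring A-D link lowers the degree of both A and D. The candidate key proteins would then be the top-ranked nodes of the refined network, which can be compared with a list of known essential proteins.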
format Online
Article
Text
id pubmed-10121005
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10121005 2023-04-22 Comparative analysis of gene ontology-based semantic similarity measurements for the application of identifying essential proteins Xue, Xiaoli Zhang, Wei Fan, Anjing PLoS One Research Article
Public Library of Science 2023-04-21 /pmc/articles/PMC10121005/ /pubmed/37083829 http://dx.doi.org/10.1371/journal.pone.0284274 Text en © 2023 Xue et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Comparative analysis of gene ontology-based semantic similarity measurements for the application of identifying essential proteins
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10121005/
https://www.ncbi.nlm.nih.gov/pubmed/37083829
http://dx.doi.org/10.1371/journal.pone.0284274