
Improved prediction of missing protein interactome links via anomaly detection

Interactomes such as Protein interaction networks have many undiscovered links between entities. Experimental verification of every link in these networks is prohibitively expensive, and therefore computational methods to direct the search for possible links are of great value. The problem of findin...


Bibliographic Details
Main authors:	Singh, Kushal Veer; Vig, Lovekesh
Format:	Online Article Text
Language:	English
Published:	Springer International Publishing 2017
Subjects:	Research
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6245231/ https://www.ncbi.nlm.nih.gov/pubmed/30533510 http://dx.doi.org/10.1007/s41109-017-0022-7

_version_	1783372199552352256
author	Singh, Kushal Veer Vig, Lovekesh
author_facet	Singh, Kushal Veer Vig, Lovekesh
author_sort	Singh, Kushal Veer
collection	PubMed
description	Interactomes such as Protein interaction networks have many undiscovered links between entities. Experimental verification of every link in these networks is prohibitively expensive, and therefore computational methods to direct the search for possible links are of great value. The problem of finding undiscovered links in a network is also referred to as the link prediction problem. A popular approach for link prediction has been to formulate it as a binary classification problem in which class labels indicate the existence or absence of a link (we refer to these as positive links or negative links respectively) between a pair of nodes in the network. Researchers have successfully applied such supervised classification techniques to determine the presence of links in protein interaction networks. However, it is quite common for protein-protein interaction (PPI) networks to have a large proportion of undiscovered links. Thus, a link prediction approach could incorrectly treat undiscovered positive links as negative links, thereby introducing a bias in the learning. In this paper, we propose to denoise the class of negative links in the training data via a Gaussian process anomaly detector. We show that this significantly reduces the noise due to mislabelled negative links and improves the resulting link prediction accuracy. We evaluate the approach by introducing synthetic noise into the PPI networks and measuring how accurately we can reconstruct the original PPI networks using classifiers trained on both noisy and denoised data. Experiments were performed with five different PPI network datasets and the results indicate a significant reduction in bias due to label noise, and more importantly, a significant improvement in the accuracy of detecting missing links via classification.
format	Online Article Text
id	pubmed-6245231
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	Springer International Publishing
record_format	MEDLINE/PubMed
spelling	pubmed-62452312018-12-06 Improved prediction of missing protein interactome links via anomaly detection Singh, Kushal Veer Vig, Lovekesh Appl Netw Sci Research
Springer International Publishing 2017-01-28 2017 /pmc/articles/PMC6245231/ /pubmed/30533510 http://dx.doi.org/10.1007/s41109-017-0022-7 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
spellingShingle	Research Singh, Kushal Veer Vig, Lovekesh Improved prediction of missing protein interactome links via anomaly detection
title	Improved prediction of missing protein interactome links via anomaly detection
title_full	Improved prediction of missing protein interactome links via anomaly detection
title_fullStr	Improved prediction of missing protein interactome links via anomaly detection
title_full_unstemmed	Improved prediction of missing protein interactome links via anomaly detection
title_short	Improved prediction of missing protein interactome links via anomaly detection
title_sort	improved prediction of missing protein interactome links via anomaly detection
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6245231/ https://www.ncbi.nlm.nih.gov/pubmed/30533510 http://dx.doi.org/10.1007/s41109-017-0022-7
work_keys_str_mv	AT singhkushalveer improvedpredictionofmissingproteininteractomelinksviaanomalydetection AT viglovekesh improvedpredictionofmissingproteininteractomelinksviaanomalydetection
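
Illustrative note on the method summarized in the description field: the abstract says that negative links in the training data are denoised with a Gaussian process anomaly detector before a classifier is trained. The sketch below shows one plausible reading of that idea and is not the authors' implementation. It assumes a one-class Gaussian process regression fitted on known positive links (target value 1, zero-mean prior, RBF kernel), and it treats any negative link whose predicted mean is high as a suspected mislabelled positive. The feature vectors, the function names (rbf, solve, gp_scores, denoise_negatives), the noise level, the length scale, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential (RBF) kernel between two feature vectors."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / (2.0 * length_scale ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def gp_scores(positives, candidates, noise=0.1, length_scale=1.0):
    """GP regression mean at each candidate, trained on positives with target 1.

    With a zero-mean prior, the mean is near 1 close to the positive class
    and decays toward 0 far away from it.
    """
    n = len(positives)
    K = [
        [rbf(positives[i], positives[j], length_scale) + (noise if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    alpha = solve(K, [1.0] * n)  # alpha = (K + noise * I)^-1 * 1
    return [
        sum(rbf(c, p, length_scale) * a for p, a in zip(positives, alpha))
        for c in candidates
    ]


def denoise_negatives(positives, negatives, threshold=0.5, **kwargs):
    """Split negatives into (kept, suspects); suspects resemble positive links."""
    scores = gp_scores(positives, negatives, **kwargs)
    kept = [x for x, s in zip(negatives, scores) if s < threshold]
    suspects = [x for x, s in zip(negatives, scores) if s >= threshold]
    return kept, suspects


# Toy, made-up pair features (hypothetical; not taken from any PPI dataset).
positives = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.2, 1.1)]
negatives = [(-2.0, -2.0), (-1.8, -2.2), (1.05, 1.0), (-2.1, -1.9)]

kept, suspects = denoise_negatives(positives, negatives)
print("kept negatives:", kept)
print("suspected mislabelled positives:", suspects)
```

In this toy setup the suspect set would be removed from the negative class before training a binary link classifier on the positives and the remaining negatives. A real pipeline would need node-pair features derived from the network and a threshold chosen by validation, neither of which this record specifies.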

Similar items