Completing sparse and disconnected protein-protein network by deep learning

Bibliographic Details
Main Authors: Huang, Lei, Liao, Li, Wu, Cathy H.
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5863833/
https://www.ncbi.nlm.nih.gov/pubmed/29566671
http://dx.doi.org/10.1186/s12859-018-2112-7
author Huang, Lei
Liao, Li
Wu, Cathy H.
collection PubMed
description BACKGROUND: Protein-protein interaction (PPI) prediction remains a central task in systems biology for achieving a better, more holistic understanding of cellular and intracellular processes. Recently, an increasing number of computational methods have shifted from pair-wise prediction to network-level prediction. Many existing network-level methods predict PPIs under the assumption that the training network is connected. However, this assumption greatly reduces prediction power and limits the range of application, because current gold-standard PPI networks are usually very sparse and disconnected. How to effectively predict PPIs from a training network that is sparse and disconnected therefore remains a challenge.
RESULTS: In this work, we developed a novel PPI prediction method based on a deep neural network and the regularized Laplacian kernel. We use a neural network with an autoencoder-like architecture to implicitly simulate the evolutionary processes of a PPI network. Neurons of the output layer correspond to proteins and are labeled with values (1 for interaction and 0 otherwise) from the adjacency matrix of a sparse, disconnected training PPI network. Unlike in an autoencoder, neurons at the input layer are given all-zero input, reflecting an assumption of no a priori knowledge about PPIs, and hidden layers of smaller sizes mimic ancient interactomes at different times during evolution. After training, we obtain an evolved PPI network whose rows are the outputs of the neural network. We then predict PPIs by applying the regularized Laplacian kernel to the transition matrix built upon the evolved PPI network. Cross-validation experiments show that PPI prediction accuracy, measured as AUC, increases by up to 8.4% for yeast data and 14.9% for human data compared to the baseline. Moreover, the evolved PPI network also helps leverage complementary information from the disconnected training network and multiple heterogeneous data sources. In tests on the yeast data with six heterogeneous feature kernels, our method further improves prediction performance by up to 2%, which is very close to an upper bound obtained by an Approximate Bayesian Computation based sampling method.
CONCLUSIONS: The proposed evolution deep neural network, coupled with the regularized Laplacian kernel, is an effective tool for completing sparse and disconnected PPI networks and for facilitating the integration of heterogeneous data sources.
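The abstract does not spell out the kernel computation, so the following is only a minimal sketch of the textbook regularized Laplacian kernel, K = (I + alpha * L)^-1 with L = D - A, on a toy undirected graph. The toy graph, the value of alpha, and every function name are illustrative assumptions; the authors apply the kernel to a transition matrix derived from their evolved network, which is not reproduced here. The sketch also illustrates the limitation the abstract describes: a node with no path to another node receives a kernel score of zero for that pair.

```python
# Sketch only: standard regularized Laplacian kernel K = (I + alpha * L)^-1,
# with L = D - A, on a toy graph. Names and values are assumptions for
# illustration, not taken from the paper.

def laplacian(adj):
    """Return the combinatorial Laplacian L = D - A of a symmetric adjacency matrix."""
    n = len(adj)
    return [[(float(sum(adj[i])) if i == j else 0.0) - adj[i][j]
             for j in range(n)] for i in range(n)]

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [M | I].
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    # The right half now holds the inverse.
    return [row[n:] for row in aug]

def regularized_laplacian_kernel(adj, alpha=0.5):
    """Compute K = (I + alpha * L)^-1 for a symmetric adjacency matrix."""
    n = len(adj)
    lap = laplacian(adj)
    shifted = [[(1.0 if i == j else 0.0) + alpha * lap[i][j] for j in range(n)]
               for i in range(n)]
    return invert(shifted)

# Toy network: path 0 - 1 - 2, plus node 3 with no interactions at all.
ADJ = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
K = regularized_laplacian_kernel(ADJ, alpha=0.5)
```

In this toy case, the pair (0, 2) has no direct edge but still gets a positive score through node 1, while every pair involving the isolated node 3 scores zero. Candidate interactions would be ranked by their off-diagonal kernel entries.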
format Online
Article
Text
id pubmed-5863833
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5863833 2018-03-27 Completing sparse and disconnected protein-protein network by deep learning Huang, Lei Liao, Li Wu, Cathy H. BMC Bioinformatics Research Article BioMed Central 2018-03-22 /pmc/articles/PMC5863833/ /pubmed/29566671 http://dx.doi.org/10.1186/s12859-018-2112-7 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Completing sparse and disconnected protein-protein network by deep learning
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5863833/
https://www.ncbi.nlm.nih.gov/pubmed/29566671
http://dx.doi.org/10.1186/s12859-018-2112-7