
An iteration method for identifying yeast essential proteins from heterogeneous network

BACKGROUND: Essential proteins are distinctly important for an organism’s survival and development and crucial to disease analysis and drug design as well. Large-scale protein-protein interaction (PPI) data sets exist in Saccharomyces cerevisiae, which provides us with a valuable opportunity to pred...


Bibliographic Details
Main authors: Zhao, Bihai, Zhao, Yulin, Zhang, Xiaoxia, Zhang, Zhihong, Zhang, Fan, Wang, Lei
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6591974/
https://www.ncbi.nlm.nih.gov/pubmed/31234779
http://dx.doi.org/10.1186/s12859-019-2930-2
_version_ 1783429819178942464
author Zhao, Bihai
Zhao, Yulin
Zhang, Xiaoxia
Zhang, Zhihong
Zhang, Fan
Wang, Lei
author_sort Zhao, Bihai
collection PubMed
description BACKGROUND: Essential proteins are vital to an organism’s survival and development and are crucial to disease analysis and drug design. Large-scale protein-protein interaction (PPI) data sets exist for Saccharomyces cerevisiae, which provide a valuable opportunity to identify essential proteins from PPI networks. Many network topology-based computational methods have been designed to detect essential proteins. However, these methods are limited by the completeness of the available PPI data. To overcome this limitation, some computational methods integrate PPI networks with multi-source biological data. Despite progress in multi-source data fusion, improving the prediction accuracy of these computational methods remains challenging. RESULTS: In this paper, we design a novel iterative model for essential protein prediction, named Randomly Walking in the Heterogeneous Network (RWHN). In RWHN, a weighted protein-protein interaction network and a domain-domain association network are first constructed from the original PPI network and the known protein-domain association network. We then establish a new heterogeneous matrix by combining the two constructed networks with the protein-domain association network. Based on the heterogeneous matrix, a transition probability matrix is established by normalization. Finally, an improved PageRank algorithm is applied to the heterogeneous network to predict essential proteins. To eliminate the influence of false negatives, information on orthologous proteins and on the subcellular localization of proteins is integrated to initialize the protein score vector. RWHN thus takes the topological, conservation, and functional features of essential proteins into account in the prediction process. The experimental results show that RWHN clearly outperforms ten other competing methods in predicting essential proteins. CONCLUSIONS: We demonstrated that integrating multi-source data into a heterogeneous network can preserve the complex relationships among multiple biological data sources and improve the prediction accuracy of essential proteins. RWHN, our proposed method, is effective for the prediction of essential proteins.
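The pipeline outlined in the abstract (heterogeneous matrix, normalization into a transition probability matrix, PageRank-style iteration from an initialized score vector) can be illustrated with a minimal sketch. This is not the authors' RWHN implementation: the block layout of the matrix, the column normalization, the damping factor `alpha`, the function names, and the toy numbers are all assumptions chosen for illustration, and a generic random walk with restart stands in for the paper's improved PageRank.

```python
# Illustrative sketch only: a generic random walk with restart on a toy
# heterogeneous (protein + domain) network. Block layout, alpha, and all
# numbers are assumptions, not values or formulas taken from the paper.

def build_heterogeneous_matrix(pp, pd, dd):
    """Assemble the block matrix [[PP, PD], [PD^T, DD]] from nested lists."""
    n_p, n_d = len(pp), len(dd)
    top = [pp[i] + pd[i] for i in range(n_p)]
    bottom = [[pd[i][k] for i in range(n_p)] + dd[k] for k in range(n_d)]
    return top + bottom

def column_normalize(matrix):
    """Scale each column to sum to 1; all-zero columns are left as zeros."""
    n = len(matrix)
    sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [[matrix[i][j] / sums[j] if sums[j] > 0 else 0.0
             for j in range(n)] for i in range(n)]

def random_walk_with_restart(transition, initial, alpha=0.85,
                             tol=1e-9, max_iter=1000):
    """Iterate s <- alpha * T s + (1 - alpha) * s0 until the L1 change < tol."""
    scores = list(initial)
    n = len(scores)
    for _ in range(max_iter):
        updated = [alpha * sum(transition[i][j] * scores[j] for j in range(n))
                   + (1 - alpha) * initial[i] for i in range(n)]
        if sum(abs(u - s) for u, s in zip(updated, scores)) < tol:
            return updated
        scores = updated
    return scores

# Toy network: 3 proteins, 2 domains (hypothetical data).
pp = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]   # protein-protein interactions
pd = [[1, 0], [1, 0], [0, 1]]            # protein-domain associations
dd = [[0, 1], [1, 0]]                    # domain-domain associations

# Hypothetical prior scores for the proteins (e.g. derived from orthology
# and localization evidence); domains start at zero.
initial = [0.5, 0.3, 0.2, 0.0, 0.0]

transition = column_normalize(build_heterogeneous_matrix(pp, pd, dd))
scores = random_walk_with_restart(transition, initial)
protein_ranking = sorted(range(len(pp)), key=lambda i: scores[i], reverse=True)
```

In this sketch the first three entries of `scores` rank the proteins and the remaining entries belong to the domains; how the actual method weights the blocks and initializes the vector is described in the full article, not here.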
format Online
Article
Text
id pubmed-6591974
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6591974 2019-07-08 An iteration method for identifying yeast essential proteins from heterogeneous network Zhao, Bihai Zhao, Yulin Zhang, Xiaoxia Zhang, Zhihong Zhang, Fan Wang, Lei BMC Bioinformatics Research Article BioMed Central 2019-06-24 /pmc/articles/PMC6591974/ /pubmed/31234779 http://dx.doi.org/10.1186/s12859-019-2930-2 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title An iteration method for identifying yeast essential proteins from heterogeneous network
title_sort iteration method for identifying yeast essential proteins from heterogeneous network
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6591974/
https://www.ncbi.nlm.nih.gov/pubmed/31234779
http://dx.doi.org/10.1186/s12859-019-2930-2