A Novel Method for Identifying Essential Proteins Based on Non-negative Matrix Tri-Factorization

Identification of essential proteins is very important for understanding the basic requirements to sustain a living organism. In recent years, there has been an increasing interest in using computational methods to predict essential proteins based on protein–protein interaction (PPI) networks or fus...

Bibliographic Details
Main authors: Zhang, Zhihong, Jiang, Meiping, Wu, Dongjie, Zhang, Wang, Yan, Wei, Qu, Xilong
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8378176/
https://www.ncbi.nlm.nih.gov/pubmed/34422014
http://dx.doi.org/10.3389/fgene.2021.709660
author Zhang, Zhihong
Jiang, Meiping
Wu, Dongjie
Zhang, Wang
Yan, Wei
Qu, Xilong
collection PubMed
description Identifying essential proteins is important for understanding the basic requirements for sustaining a living organism. In recent years, there has been increasing interest in computational methods that predict essential proteins from protein–protein interaction (PPI) networks or by fusing multiple sources of biological information. However, existing PPI data contain both false negatives and false positives. Fusing multiple sources of biological information can reduce the influence of false data in PPI networks, but it inevitably introduces additional noise. In this article, we propose a novel non-negative matrix tri-factorization (NMTF)-based model (NTMEP) to predict essential proteins. First, a weighted PPI network is built using only the topological features of the network, so as to avoid introducing more noise. To reduce the influence of false data in the PPI network on the identification of essential proteins, the NMTF technique, which is widely used in recommendation algorithms, is applied to reconstruct an optimized PPI network containing more potential protein–protein interactions. Then, the PageRank algorithm is used to compute the final ranking score of each protein, with subcellular localization and homology information used to calculate the initial scores. Extensive experiments on publicly available datasets indicate that the NTMEP model performs better in predicting essential proteins than the state-of-the-art method. This investigation demonstrates that introducing non-negative matrix tri-factorization can effectively improve the quality of the protein–protein interaction network and thereby reduce the negative impact of noise on prediction. This finding also offers a new perspective for other applications based on protein–protein interaction networks.
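The two computational steps named in the description (NMTF reconstruction of the PPI network, then PageRank seeded with initial scores) can be illustrated with a minimal sketch. It is not the authors' NTMEP implementation: the record does not give the edge-weighting scheme, the objective function, or the parameter values, so the sketch assumes a plain Frobenius-norm objective with standard multiplicative updates and no regularization. The toy adjacency matrix `X`, the initial score vector `p`, the rank `k`, and the damping factor `alpha` are all hypothetical.

```python
import numpy as np

def nmtf(X, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate X by F @ S @ G.T with non-negative F, S, G.

    Uses multiplicative updates for the objective ||X - F S G^T||_F^2
    (an assumed objective; the paper's model may add further terms).
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    F = rng.random((n, k)) + eps
    S = rng.random((k, k)) + eps
    G = rng.random((m, k)) + eps
    for _ in range(n_iter):
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
    return F, S, G

def personalized_pagerank(W, p, alpha=0.85, n_iter=100):
    """Rank nodes of a weighted network W, restarting at initial scores p."""
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    row_sums = W.sum(axis=1, keepdims=True)
    safe = np.where(row_sums > 0, row_sums, 1.0)
    # Row-normalize; rows without any weight fall back to a uniform row.
    P = np.where(row_sums > 0, W / safe, 1.0 / n)
    p = np.asarray(p, dtype=float)
    p = p / p.sum()
    r = p.copy()
    for _ in range(n_iter):
        r = alpha * (P.T @ r) + (1 - alpha) * p
    return r

# Hypothetical 6-protein network (symmetric 0/1 adjacency matrix).
X = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)

# Hypothetical initial scores, standing in for the subcellular
# localization and homology information mentioned in the description.
p = np.array([0.2, 0.1, 0.3, 0.2, 0.1, 0.1])

F, S, G = nmtf(X, k=2)
X_rec = F @ S @ G.T            # dense, reconstructed weighted network
scores = personalized_pagerank(X_rec, p)
ranking = np.argsort(-scores)  # protein indices, highest score first
```

Because the reconstruction is dense, pairs that had no edge in `X` can receive a positive weight in `X_rec`. This is the sense in which factorization can surface potential interactions before ranking.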
format Online
Article
Text
id pubmed-8378176
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8378176 2021-08-21 A Novel Method for Identifying Essential Proteins Based on Non-negative Matrix Tri-Factorization. Zhang, Zhihong; Jiang, Meiping; Wu, Dongjie; Zhang, Wang; Yan, Wei; Qu, Xilong. Front Genet. Genetics. Frontiers Media S.A. 2021-08-06 /pmc/articles/PMC8378176/ /pubmed/34422014 http://dx.doi.org/10.3389/fgene.2021.709660 Text en
Copyright © 2021 Zhang, Jiang, Wu, Zhang, Yan and Qu. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title A Novel Method for Identifying Essential Proteins Based on Non-negative Matrix Tri-Factorization
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8378176/
https://www.ncbi.nlm.nih.gov/pubmed/34422014
http://dx.doi.org/10.3389/fgene.2021.709660