
Protein functional properties prediction in sparsely-label PPI networks through regularized non-negative matrix factorization

BACKGROUND: Predicting the functional properties of proteins in protein-protein interaction (PPI) networks is a challenging problem with important implications for computational biology. Collective classification (CC) that utilizes both attribute features and relational information to jointly cla...


Bibliographic Details
Main Authors: Wu, Qingyao, Wang, Zhenyu, Li, Chunshan, Ye, Yunming, Li, Yueping, Sun, Ning
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4331684/
https://www.ncbi.nlm.nih.gov/pubmed/25708164
http://dx.doi.org/10.1186/1752-0509-9-S1-S9
_version_ 1782357758113218560
author Wu, Qingyao
Wang, Zhenyu
Li, Chunshan
Ye, Yunming
Li, Yueping
Sun, Ning
collection PubMed
description BACKGROUND: Predicting the functional properties of proteins in protein-protein interaction (PPI) networks is a challenging problem with important implications for computational biology. Collective classification (CC), which uses both attribute features and relational information to jointly classify related proteins in PPI networks, has been shown to be a powerful computational method for this problem setting. Enabling CC usually increases accuracy when a fully labeled PPI network with a large amount of labeled data is available. However, such labels can be difficult to obtain in many real-world PPI networks, which usually contain only a limited number of labeled proteins and a large number of unlabeled proteins. In this case, most of the unlabeled proteins may not be connected to the labeled ones, so supervision knowledge cannot be obtained effectively from local network connections. As a consequence, learning a CC model in sparsely labeled PPI networks can lead to poor performance.
RESULTS: We investigate a latent graph approach that builds an integrated latent graph by exploiting various latent linkages and judiciously combining them to link proteins with similar functions and separate proteins with different functions. We develop a regularized non-negative matrix factorization (RNMF) algorithm for CC that predicts protein functional properties using the various data sources available in this problem setting, including attribute features, the latent graph, and unlabeled data. In RNMF, a label matrix factorization term and a network regularization term are incorporated into the non-negative matrix factorization (NMF) objective function to seek a factorization that respects both the network structure and the label information for classification.
CONCLUSION: Experimental results on the KDD Cup tasks of predicting the localization and functions of proteins encoded by yeast genes demonstrate the effectiveness of the proposed RNMF method for predicting protein properties. In the comparison, the new method outperforms the other CC algorithms compared, especially when labeled proteins are scarce.
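The abstract names the ingredients of the RNMF objective (an NMF reconstruction term, a label matrix factorization term, and a network regularization term) but does not give the formula. The sketch below is one plausible reading, not the authors' formulation: it assumes an objective of the form ||X - V Uᵀ||² + alpha·||M ∘ (Y - V W)||² + beta·tr(Vᵀ(D - A)V), minimized with standard multiplicative updates. The function name `rnmf_sketch`, the symbols `U`, `V`, `W`, the weights `alpha` and `beta`, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def rnmf_sketch(X, A, Y, labeled, k=4, alpha=1.0, beta=0.1, iters=200, seed=0):
    """Illustrative graph- and label-regularized NMF (assumed objective, see lead-in).

    X: (n, d) non-negative attribute features
    A: (n, n) symmetric non-negative adjacency of the (latent) graph
    Y: (n, c) one-hot labels; rows of unlabeled proteins are ignored
    labeled: (n,) boolean mask marking labeled proteins
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    c = Y.shape[1]
    V = rng.random((n, k)) + 0.1   # latent protein factors
    U = rng.random((d, k)) + 0.1   # feature basis
    W = rng.random((k, c)) + 0.1   # maps latent factors to label scores
    m = labeled.astype(float)[:, None]   # 0/1 mask selecting labeled rows
    deg = A.sum(axis=1)[:, None]         # node degrees (diagonal of D)
    eps = 1e-9                           # guards against division by zero

    def loss():
        rec = np.sum((X - V @ U.T) ** 2)
        lab = np.sum((m * (Y - V @ W)) ** 2)
        smooth = np.sum(V * (deg * V - A @ V))   # tr(V^T (D - A) V)
        return rec + alpha * lab + beta * smooth

    history = [loss()]
    for _ in range(iters):
        # Multiplicative updates keep every factor non-negative.
        U *= (X.T @ V) / (U @ (V.T @ V) + eps)
        W *= (V.T @ (m * Y)) / (V.T @ (m * (V @ W)) + eps)
        num = X @ U + alpha * (m * Y) @ W.T + beta * (A @ V)
        den = V @ (U.T @ U) + alpha * (m * (V @ W)) @ W.T + beta * (deg * V) + eps
        V *= num / den
        history.append(loss())
    return V, U, W, history

# Toy data (hypothetical): two groups of 10 "proteins", one labeled protein per group.
rng = np.random.default_rng(1)
n = 20
classes = np.repeat([0, 1], n // 2)
X = np.abs(rng.normal(size=(n, 6))) + 2.0 * np.eye(2)[classes].repeat(3, axis=1)
A = (classes[:, None] == classes[None, :]).astype(float)
np.fill_diagonal(A, 0.0)
Y = np.eye(2)[classes]
labeled = np.zeros(n, dtype=bool)
labeled[[0, 10]] = True

V, U, W, history = rnmf_sketch(X, A, Y, labeled)
pred = (V @ W).argmax(axis=1)   # predicted class for every protein
```

Here the label term only touches the rows selected by `labeled`, while the network term pulls linked proteins toward similar latent factors, which is how unlabeled proteins can receive supervision indirectly in a sparsely labeled network.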
format Online
Article
Text
id pubmed-4331684
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4331684 2015-03-25 Protein functional properties prediction in sparsely-label PPI networks through regularized non-negative matrix factorization Wu, Qingyao Wang, Zhenyu Li, Chunshan Ye, Yunming Li, Yueping Sun, Ning BMC Syst Biol Proceedings BioMed Central 2015-01-21 /pmc/articles/PMC4331684/ /pubmed/25708164 http://dx.doi.org/10.1186/1752-0509-9-S1-S9 Text en Copyright © 2015 Wu et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Protein functional properties prediction in sparsely-label PPI networks through regularized non-negative matrix factorization
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4331684/
https://www.ncbi.nlm.nih.gov/pubmed/25708164
http://dx.doi.org/10.1186/1752-0509-9-S1-S9