Predicting protein-protein interactions using high-quality non-interacting pairs
BACKGROUND: Identifying protein-protein interactions (PPIs) is of paramount importance for understanding cellular processes. Machine learning-based approaches have been developed to predict PPIs, but the effectiveness of these approaches is unsatisfactory. One major reason is that they randomly choo...
Main authors: | Zhang, Long; Yu, Guoxian; Guo, Maozu; Wang, Jun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central, 2018 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6311908/ https://www.ncbi.nlm.nih.gov/pubmed/30598096 http://dx.doi.org/10.1186/s12859-018-2525-3 |
_version_ | 1783383698752667648 |
---|---|
author | Zhang, Long Yu, Guoxian Guo, Maozu Wang, Jun |
author_facet | Zhang, Long Yu, Guoxian Guo, Maozu Wang, Jun |
author_sort | Zhang, Long |
collection | PubMed |
description | BACKGROUND: Identifying protein-protein interactions (PPIs) is of paramount importance for understanding cellular processes. Machine learning-based approaches have been developed to predict PPIs, but the effectiveness of these approaches is unsatisfactory. One major reason is that they randomly choose non-interacting protein pairs (negative samples) or heuristically select non-interacting pairs with low quality. RESULTS: To boost the effectiveness of predicting PPIs, we propose two novel approaches (NIP-SS and NIP-RW) to generate high quality non-interacting pairs based on sequence similarity and random walk, respectively. Specifically, the known PPIs collected from public databases are used to generate the positive samples. NIP-SS then selects the top-m dissimilar protein pairs as negative examples and controls the degree distribution of selected proteins to construct the negative dataset. NIP-RW performs random walk on the PPI network to update the adjacency matrix of the network, and then selects protein pairs not connected in the updated network as negative samples. Next, we use auto covariance (AC) descriptor to encode the feature information of amino acid sequences. After that, we employ deep neural networks (DNNs) to predict PPIs based on extracted features, positive and negative examples. Extensive experiments show that NIP-SS and NIP-RW can generate negative samples with higher quality than existing strategies and thus enable more accurate prediction. CONCLUSIONS: The experimental results prove that negative datasets constructed by NIP-SS and NIP-RW can reduce the bias and have good generalization ability. NIP-SS and NIP-RW can be used as a plugin to boost the effectiveness of PPIs prediction. Codes and datasets are available at http://mlda.swu.edu.cn/codes.php?name=NIP. |
format | Online Article Text |
id | pubmed-6311908 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6311908 2019-01-07 Predicting protein-protein interactions using high-quality non-interacting pairs Zhang, Long Yu, Guoxian Guo, Maozu Wang, Jun BMC Bioinformatics Research BACKGROUND: Identifying protein-protein interactions (PPIs) is of paramount importance for understanding cellular processes. Machine learning-based approaches have been developed to predict PPIs, but the effectiveness of these approaches is unsatisfactory. One major reason is that they randomly choose non-interacting protein pairs (negative samples) or heuristically select non-interacting pairs with low quality. RESULTS: To boost the effectiveness of predicting PPIs, we propose two novel approaches (NIP-SS and NIP-RW) to generate high quality non-interacting pairs based on sequence similarity and random walk, respectively. Specifically, the known PPIs collected from public databases are used to generate the positive samples. NIP-SS then selects the top-m dissimilar protein pairs as negative examples and controls the degree distribution of selected proteins to construct the negative dataset. NIP-RW performs random walk on the PPI network to update the adjacency matrix of the network, and then selects protein pairs not connected in the updated network as negative samples. Next, we use auto covariance (AC) descriptor to encode the feature information of amino acid sequences. After that, we employ deep neural networks (DNNs) to predict PPIs based on extracted features, positive and negative examples. Extensive experiments show that NIP-SS and NIP-RW can generate negative samples with higher quality than existing strategies and thus enable more accurate prediction. CONCLUSIONS: The experimental results prove that negative datasets constructed by NIP-SS and NIP-RW can reduce the bias and have good generalization ability. NIP-SS and NIP-RW can be used as a plugin to boost the effectiveness of PPIs prediction. Codes and datasets are available at http://mlda.swu.edu.cn/codes.php?name=NIP. BioMed Central 2018-12-31 /pmc/articles/PMC6311908/ /pubmed/30598096 http://dx.doi.org/10.1186/s12859-018-2525-3 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Zhang, Long Yu, Guoxian Guo, Maozu Wang, Jun Predicting protein-protein interactions using high-quality non-interacting pairs |
title | Predicting protein-protein interactions using high-quality non-interacting pairs |
title_full | Predicting protein-protein interactions using high-quality non-interacting pairs |
title_fullStr | Predicting protein-protein interactions using high-quality non-interacting pairs |
title_full_unstemmed | Predicting protein-protein interactions using high-quality non-interacting pairs |
title_short | Predicting protein-protein interactions using high-quality non-interacting pairs |
title_sort | predicting protein-protein interactions using high-quality non-interacting pairs |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6311908/ https://www.ncbi.nlm.nih.gov/pubmed/30598096 http://dx.doi.org/10.1186/s12859-018-2525-3 |
work_keys_str_mv | AT zhanglong predictingproteinproteininteractionsusinghighqualitynoninteractingpairs AT yuguoxian predictingproteinproteininteractionsusinghighqualitynoninteractingpairs AT guomaozu predictingproteinproteininteractionsusinghighqualitynoninteractingpairs AT wangjun predictingproteinproteininteractionsusinghighqualitynoninteractingpairs |
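
The NIP-RW step described in the abstract (random walk on the PPI network, then selecting protein pairs that remain unconnected) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `random_walk_negatives`, the `steps` parameter, and the rule of accumulating k-step walk scores into the updated matrix are assumptions made for illustration, since the record does not say how the adjacency matrix is updated.

```python
import numpy as np


def random_walk_negatives(adj, steps=3):
    """Return index pairs (i, j), i < j, that no walk of length <= `steps` connects.

    `adj` is a symmetric 0/1 adjacency matrix built from known PPIs.
    Hypothetical helper: the update rule below is an assumed stand-in
    for the unspecified NIP-RW update, not the published method.
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1, keepdims=True)
    # Row-normalised transition matrix; isolated proteins keep an all-zero row.
    trans = np.divide(adj, deg, out=np.zeros_like(adj), where=deg > 0)

    updated = adj.copy()  # accumulated scores over walks of length 1..steps
    walk = adj.copy()     # scores for walks of the current length
    for _ in range(steps - 1):
        walk = walk @ trans
        updated += walk

    n = adj.shape[0]
    # Pairs with a zero score in the updated matrix are candidate negatives.
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if updated[i, j] == 0]


# Toy network: a chain of five proteins 0-1-2-3-4.
chain = np.zeros((5, 5))
for a, b in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    chain[a, b] = chain[b, a] = 1

negatives = random_walk_negatives(chain, steps=2)
```

On the toy chain, a larger `steps` value leaves fewer candidate pairs, because only proteins that are far apart in the network stay unconnected after the update. That matches the intuition behind using network distance as evidence of non-interaction.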