DCSE:Double-Channel-Siamese-Ensemble model for protein protein interaction prediction

BACKGROUND: Protein-protein interaction (PPI) is very important for many biochemical processes. Therefore, accurate prediction of PPI can help us better understand the role of proteins in biochemical processes. Although there are many methods to predict PPI in biology, they are time-consuming and la...

Bibliographic Details
Main Authors: Chen, Wenqi; Wang, Shuang; Song, Tao; Li, Xue; Han, Peifu; Gao, Changnan
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9351149/
https://www.ncbi.nlm.nih.gov/pubmed/35922751
http://dx.doi.org/10.1186/s12864-022-08772-6
author Chen, Wenqi
Wang, Shuang
Song, Tao
Li, Xue
Han, Peifu
Gao, Changnan
collection PubMed
description BACKGROUND: Protein-protein interaction (PPI) is very important for many biochemical processes, so accurate prediction of PPI can help us better understand the role of proteins in these processes. Although there are many biological methods for predicting PPI, they are time-consuming and lack accuracy, so an efficient and accurate computational model for PPI prediction is needed. RESULTS: We present a novel sequence-based computational approach called DCSE (Double-Channel-Siamese-Ensemble) to predict potential PPI. In the encoding layer, we treat each amino acid as a word and map it to an N-dimensional vector. In the feature extraction layer, we extract features from local and global perspectives using a Multilayer Convolutional Neural Network (MCN) and a Multilayer Bidirectional Gated Recurrent Unit with Convolutional Neural Networks (MBC). Finally, the output of the feature extraction layer is fed into the prediction layer, which outputs whether the input protein pair will interact with each other. MCN and MBC are siamese, ensemble-based networks, which can effectively improve the performance of the model. To demonstrate our model's performance, we compare it with four machine-learning-based and three deep-learning-based models. The results show that our method outperforms other models in all evaluation criteria. The Accuracy, Precision, F1, Recall and MCC of our model are 0.9303, 0.9091, 0.9268, 0.9452 and 0.8609, respectively. For the other seven models, the highest Accuracy, Precision, F1, Recall and MCC are 0.9288, 0.9243, 0.9246, 0.9250 and 0.8572, respectively. We also test our model on an imbalanced dataset and transfer it to another species; the results show that it also performs well in these settings. CONCLUSION: Compared with seven other models, our model achieves the best performance. The NLP-based encoding method works well for the PPI prediction task. MCN and MBC extract protein sequence features from local and global perspectives, and these two feature extraction layers are based on siamese and ensemble network structures. The siamese network structure keeps the features consistent, and the ensemble network structure effectively improves the accuracy of the model. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-022-08772-6.
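The encoding and siamese ideas in the description above can be illustrated with a minimal sketch. This is not the authors' implementation: the vocabulary layout, the padding and unknown-token indices, the randomly initialised embedding table, and the mean-pooling encoder (a stand-in for the MCN and MBC feature extractors) are all assumptions made for illustration. The only points taken from the description are that each amino acid is treated as a word mapped to an N-dimensional vector, and that both proteins of a pair pass through the same (shared-weight) encoder.

```python
import random

# Hypothetical vocabulary: the 20 standard amino acids, plus PAD and UNK ids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD, UNK = 0, 1
VOCAB = {aa: i + 2 for i, aa in enumerate(AMINO_ACIDS)}


def tokenize(seq, max_len):
    """Treat each amino acid as a 'word': map it to an integer id, then pad."""
    ids = [VOCAB.get(aa, UNK) for aa in seq.upper()[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


def make_embedding(vocab_size, dim, seed=0):
    """Randomly initialised N-dimensional embedding table (assumption:
    in a real model these vectors would be learned during training)."""
    rng = random.Random(seed)
    table = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
             for _ in range(vocab_size)]
    table[PAD] = [0.0] * dim  # padding carries no information
    return table


def encode(ids, table):
    """Stand-in encoder: embed each token, then mean-pool the non-pad rows.
    A real model would apply convolutional / recurrent layers here."""
    dim = len(table[0])
    rows = [table[i] for i in ids if i != PAD]
    if not rows:
        return [0.0] * dim
    return [sum(row[d] for row in rows) / len(rows) for d in range(dim)]


def pair_features(seq_a, seq_b, table, max_len=50):
    """Siamese step: the SAME table and encoder process both proteins,
    and the two resulting vectors are concatenated for a downstream classifier."""
    vec_a = encode(tokenize(seq_a, max_len), table)
    vec_b = encode(tokenize(seq_b, max_len), table)
    return vec_a + vec_b


table = make_embedding(vocab_size=len(VOCAB) + 2, dim=8)
features = pair_features("MKTAYIAKQR", "MSDNGELEDK", table)
print(len(features))  # two 8-dimensional vectors concatenated
```

Because the weights are shared, two identical input sequences always produce identical halves of the feature vector, which is the sense in which a siamese structure keeps features consistent across the pair.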
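The five evaluation criteria named in the description can be computed from a binary confusion matrix with their standard definitions. The sketch below is illustrative only: the function names and the example counts are hypothetical, not data from the article. As a consistency check, the F1 score implied by the reported Precision (0.9091) and Recall (0.9452) is recomputed, and it agrees with the reported 0.9268 to four decimal places.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only to exercise the function.
example = binary_metrics(tp=90, fp=10, tn=85, fn=15)
print(example)

# Consistency check against the values reported in the description.
implied_f1 = round(f1_from(0.9091, 0.9452), 4)
print(implied_f1)
```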
format Online
Article
Text
id pubmed-9351149
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9351149 2022-08-05 DCSE:Double-Channel-Siamese-Ensemble model for protein protein interaction prediction Chen, Wenqi; Wang, Shuang; Song, Tao; Li, Xue; Han, Peifu; Gao, Changnan BMC Genomics Research BioMed Central 2022-08-04 /pmc/articles/PMC9351149/ /pubmed/35922751 http://dx.doi.org/10.1186/s12864-022-08772-6 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title DCSE:Double-Channel-Siamese-Ensemble model for protein protein interaction prediction
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9351149/
https://www.ncbi.nlm.nih.gov/pubmed/35922751
http://dx.doi.org/10.1186/s12864-022-08772-6