Graph-BERT and language model-based framework for protein–protein interaction identification
Identification of protein–protein interactions (PPI) is among the critical problems in bioinformatics. With advances in artificial intelligence (AI) techniques, previous studies have used a variety of AI-based models for PPI classification. The input to these models is the features ext...
Main authors: | Jha, Kanchan; Karmakar, Sourav; Saha, Sriparna |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10079975/ https://www.ncbi.nlm.nih.gov/pubmed/37024543 http://dx.doi.org/10.1038/s41598-023-31612-w |
_version_ | 1785020822495166464 |
---|---|
author | Jha, Kanchan; Karmakar, Sourav; Saha, Sriparna |
author_facet | Jha, Kanchan; Karmakar, Sourav; Saha, Sriparna |
author_sort | Jha, Kanchan |
collection | PubMed |
description | Identification of protein–protein interactions (PPI) is among the critical problems in bioinformatics. With advances in artificial intelligence (AI) techniques, previous studies have used a variety of AI-based models for PPI classification. These models take as input features extracted from different sources of protein information, mainly sequence-derived features. In this work, we present an AI-based PPI identification model that uses both a PPI network and protein sequences. The PPI network is represented as a graph in which each node is a protein pair, and an edge connects two nodes if the two pairs share a common protein. Each node in the graph has a feature vector. We use a language model to extract feature vectors directly from protein sequences. The feature vectors of the two proteins in a pair are concatenated and used as the node feature vector in the PPI network graph. Finally, we use the Graph-BERT model to encode the PPI network graph with these sequence-based features and learn a hidden representation for each node. The learned node representations are then fed to a fully connected layer, whose output is passed to a softmax layer to classify the protein interactions. To assess the efficacy of the proposed PPI model, we performed experiments on several PPI datasets. The experimental results demonstrate that the proposed approach outperforms existing PPI methods and the designed baselines in classifying PPI. |
format | Online Article Text |
id | pubmed-10079975 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-10079975 2023-04-08 Graph-BERT and language model-based framework for protein–protein interaction identification Jha, Kanchan; Karmakar, Sourav; Saha, Sriparna Sci Rep Article Nature Publishing Group UK 2023-04-06 /pmc/articles/PMC10079975/ /pubmed/37024543 http://dx.doi.org/10.1038/s41598-023-31612-w Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. |
title | Graph-BERT and language model-based framework for protein–protein interaction identification |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10079975/ https://www.ncbi.nlm.nih.gov/pubmed/37024543 http://dx.doi.org/10.1038/s41598-023-31612-w |
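
The graph construction summarized in the description field can be illustrated with a small sketch. This is not the authors' implementation: the protein identifiers, the two-dimensional embedding vectors, and the weights of the classification layer are all made-up placeholders, and the Graph-BERT encoder itself is omitted, so the fully connected layer here is applied directly to the concatenated node features rather than to learned hidden representations. It only shows how protein pairs become nodes, how a shared protein creates an edge, how per-protein vectors are concatenated, and how a fully connected layer followed by softmax yields class probabilities.

```python
import math
from itertools import combinations


def build_pair_graph(pairs):
    """Return edges between protein-pair nodes that share a protein.

    Each node is identified by its index in `pairs`.
    """
    edges = set()
    for i, j in combinations(range(len(pairs)), 2):
        if set(pairs[i]) & set(pairs[j]):
            edges.add((i, j))
    return edges


def node_features(pairs, embeddings):
    """Concatenate the two per-protein vectors of every pair."""
    return [list(embeddings[a]) + list(embeddings[b]) for a, b in pairs]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(node_vector, weights, bias):
    """Fully connected layer followed by softmax (hypothetical weights)."""
    logits = [
        sum(w * x for w, x in zip(row, node_vector)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


# Hypothetical protein pairs and toy embeddings; in the paper the vectors
# come from a protein language model, which is not reproduced here.
pairs = [("P1", "P2"), ("P2", "P3"), ("P4", "P5")]
embeddings = {
    "P1": [0.1, 0.2],
    "P2": [0.3, 0.4],
    "P3": [0.5, 0.6],
    "P4": [0.7, 0.8],
    "P5": [0.9, 1.0],
}

edges = build_pair_graph(pairs)            # nodes 0 and 1 share P2
features = node_features(pairs, embeddings)

# Placeholder weights for a two-class output (interacting / non-interacting).
weights = [[0.5, -0.5, 0.5, -0.5], [-0.5, 0.5, -0.5, 0.5]]
bias = [0.0, 0.0]
probs = classify(features[0], weights, bias)

print(edges)
print(features[0])
print(probs)
```

The pairwise edge test is quadratic in the number of pairs, which is fine for illustration; for a large PPI dataset, indexing nodes by protein first would avoid comparing every pair of nodes.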