Amalgamation of 3D structure and sequence information for protein–protein interaction prediction

Protein is the primary building block of living organisms. It interacts with other proteins and is then involved in various biological processes. Protein–protein interactions (PPIs) help in predicting and hence help in understanding the functionality of the proteins, causes and growth of diseases, a...

Bibliographic Details
Main authors: Jha, Kanchan, Saha, Sriparna
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7645622/
https://www.ncbi.nlm.nih.gov/pubmed/33154416
http://dx.doi.org/10.1038/s41598-020-75467-x
_version_ 1783606666818748416
author Jha, Kanchan
Saha, Sriparna
author_facet Jha, Kanchan
Saha, Sriparna
author_sort Jha, Kanchan
collection PubMed
description Proteins are the primary building blocks of living organisms. They interact with other proteins and thereby take part in various biological processes. Knowledge of protein–protein interactions (PPIs) helps in predicting, and hence understanding, the functions of proteins and the causes and progression of diseases, and in designing new drugs. However, there is a vast gap between the number of available protein sequences and the number of identified protein–protein interactions. To bridge this gap, researchers have proposed several computational methods to reveal the interactions between proteins. These methods depend only on sequence-based information of proteins. With the advancement of technology, other types of protein information have become available, such as 3D structure information. Deep learning techniques are now adopted successfully in various domains, including bioinformatics. The current work therefore focuses on using different modalities, namely 3D structures and sequence-based information of proteins, together with deep learning algorithms to predict PPIs. The proposed approach is divided into several phases. We first obtain several visual representations of proteins from their 3D coordinates and three attributes of amino acids: hydropathy index, isoelectric point, and charge. Amino acids are the building blocks of proteins. A pre-trained ResNet50 model, a type of convolutional neural network, is used to extract features from these representations of proteins. Autocovariance and conjoint triad, two widely used sequence-based methods for encoding proteins, are used here to provide the second, sequence-based modality. A stacked autoencoder is used to obtain a compact form of the sequence-based information. Finally, the features obtained from the different modalities are concatenated in pairs and fed into the classifier to predict labels for protein pairs.
We experimented on a human PPI dataset and a Saccharomyces cerevisiae PPI dataset and compared our results with state-of-the-art deep-learning-based classifiers. The results achieved by the proposed method are superior to those obtained by the existing methods. Extensive experiments on different datasets indicate that our approach to learning and combining features from two different modalities is useful in PPI prediction.
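The sequence-based modality mentioned in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: the seven-class amino acid grouping and the min-max normalisation follow the commonly cited conjoint triad formulation, the Kyte-Doolittle hydropathy scale stands in as a single illustrative property for autocovariance, and the function names (`conjoint_triad`, `autocovariance`, `encode_pair`) and the default `max_lag` are assumptions made for this example. The ResNet50 and stacked-autoencoder stages are omitted.

```python
from itertools import product

# Seven amino-acid classes commonly used for conjoint triad encoding
# (assumed here; the record itself does not list them).
CT_CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_CLASS = {aa: idx for idx, group in enumerate(CT_CLASSES) for aa in group}

# Kyte-Doolittle hydropathy index, used as one illustrative property.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Map each ordered triple of classes to a position in a 343-element vector.
TRIAD_INDEX = {t: i for i, t in enumerate(product(range(7), repeat=3))}


def conjoint_triad(seq):
    """Return 343 triad frequencies, normalised as (count - min) / max."""
    classes = [AA_TO_CLASS[aa] for aa in seq if aa in AA_TO_CLASS]
    counts = [0] * len(TRIAD_INDEX)
    for i in range(len(classes) - 2):
        counts[TRIAD_INDEX[tuple(classes[i:i + 3])]] += 1
    lo, hi = min(counts), max(counts)
    if hi == 0:  # sequence shorter than three known residues
        return [0.0] * len(counts)
    return [(c - lo) / hi for c in counts]


def autocovariance(seq, max_lag, prop=HYDROPATHY):
    """Return the autocovariance of one property for lags 1..max_lag."""
    values = [prop[aa] for aa in seq if aa in prop]
    n = len(values)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the sequence length")
    mean = sum(values) / n
    centred = [v - mean for v in values]
    return [
        sum(centred[i] * centred[i + lag] for i in range(n - lag)) / (n - lag)
        for lag in range(1, max_lag + 1)
    ]


def encode_pair(seq_a, seq_b, max_lag=3):
    """Concatenate the sequence features of two proteins into one vector."""
    def encode(seq):
        return conjoint_triad(seq) + autocovariance(seq, max_lag)
    return encode(seq_a) + encode(seq_b)
```

With these assumptions, `encode_pair("MKTAYIAK", "GVILFPC")` yields a vector of 2 × (343 + 3) = 692 values, which a classifier could take as the sequence-only input for one protein pair.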
format Online
Article
Text
id pubmed-7645622
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-7645622 2020-11-06 Amalgamation of 3D structure and sequence information for protein–protein interaction prediction Jha, Kanchan Saha, Sriparna Sci Rep Article Nature Publishing Group UK 2020-11-05 /pmc/articles/PMC7645622/ /pubmed/33154416 http://dx.doi.org/10.1038/s41598-020-75467-x Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Jha, Kanchan
Saha, Sriparna
Amalgamation of 3D structure and sequence information for protein–protein interaction prediction
title Amalgamation of 3D structure and sequence information for protein–protein interaction prediction
title_full Amalgamation of 3D structure and sequence information for protein–protein interaction prediction
title_fullStr Amalgamation of 3D structure and sequence information for protein–protein interaction prediction
title_full_unstemmed Amalgamation of 3D structure and sequence information for protein–protein interaction prediction
title_short Amalgamation of 3D structure and sequence information for protein–protein interaction prediction
title_sort amalgamation of 3d structure and sequence information for protein–protein interaction prediction
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7645622/
https://www.ncbi.nlm.nih.gov/pubmed/33154416
http://dx.doi.org/10.1038/s41598-020-75467-x
work_keys_str_mv AT jhakanchan amalgamationof3dstructureandsequenceinformationforproteinproteininteractionprediction
AT sahasriparna amalgamationof3dstructureandsequenceinformationforproteinproteininteractionprediction