SIPGCN: A Novel Deep Learning Model for Predicting Self-Interacting Proteins from Sequence Information Using Graph Convolutional Networks

Bibliographic Details
Main Authors: Wang, Ying; Wang, Lin-Lin; Wong, Leon; Li, Yang; Wang, Lei; You, Zhu-Hong
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9313220/
https://www.ncbi.nlm.nih.gov/pubmed/35884848
http://dx.doi.org/10.3390/biomedicines10071543
_version_ 1784754026160586752
author Wang, Ying
Wang, Lin-Lin
Wong, Leon
Li, Yang
Wang, Lei
You, Zhu-Hong
author_sort Wang, Ying
collection PubMed
description Proteins are the basic organic substances that make up the cell; they are the material basis of life activities and underpin biological function. Elucidating the interactions and functions of proteins is a central task in exploring the mysteries of life. Self-interacting proteins (SIPs), an important type of protein interaction, play a critical role. The rapid growth of high-throughput experimental techniques has led to a massive influx of available SIP data, and how to conduct research with this volume of data has become a new challenge in related fields such as biology and medicine. In this work, we design SIPGCN, an SIP prediction method based on protein sequences that uses a deep learning graph convolutional network (GCN). First, protein sequences are characterized using a position-specific scoring matrix, which describes biological evolutionary information; then their hidden features are extracted by the GCN; finally, a random forest is used to predict whether interactions exist between proteins. In the cross-validation experiment, SIPGCN achieved 93.65% accuracy and 99.64% specificity on the human data set, and 90.69% accuracy and 99.08% specificity on the yeast data set. Compared with other feature models and previous methods, SIPGCN showed excellent results. These outcomes suggest that SIPGCN may be a suitable tool for predicting SIPs and may be a reliable candidate for future wet-lab experiments.
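The description above reports two evaluation metrics, accuracy and specificity. As a minimal sketch of how these are conventionally computed from binary confusion-matrix counts, the following may help; the function names and the counts are assumptions made for illustration and are not taken from the paper.

```python
"""Accuracy and specificity from binary confusion-matrix counts."""


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all samples that were classified correctly."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


def specificity(tn: int, fp: int) -> float:
    """Fraction of actual negatives that were classified as negative."""
    negatives = tn + fp
    if negatives == 0:
        raise ValueError("no negative samples")
    return tn / negatives


if __name__ == "__main__":
    # Hypothetical counts for illustration only; NOT taken from the paper.
    tp, fn = 60, 40    # self-interacting proteins: detected / missed
    tn, fp = 990, 10   # non-interacting proteins: rejected / false alarms
    print(f"accuracy    = {accuracy(tp, tn, fp, fn):.4f}")
    print(f"specificity = {specificity(tn, fp):.4f}")
```

With these made-up counts, specificity comes out higher than accuracy, because missed positives lower accuracy but do not enter the specificity formula.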
format Online
Article
Text
id pubmed-9313220
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9313220 2022-07-26
SIPGCN: A Novel Deep Learning Model for Predicting Self-Interacting Proteins from Sequence Information Using Graph Convolutional Networks
Wang, Ying; Wang, Lin-Lin; Wong, Leon; Li, Yang; Wang, Lei; You, Zhu-Hong
Biomedicines
Article
MDPI 2022-06-29
/pmc/articles/PMC9313220/ /pubmed/35884848 http://dx.doi.org/10.3390/biomedicines10071543
Text en
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title SIPGCN: A Novel Deep Learning Model for Predicting Self-Interacting Proteins from Sequence Information Using Graph Convolutional Networks
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9313220/
https://www.ncbi.nlm.nih.gov/pubmed/35884848
http://dx.doi.org/10.3390/biomedicines10071543