GCRNN: graph convolutional recurrent neural network for compound–protein interaction prediction

BACKGROUND: Compound–protein interaction prediction is necessary to investigate health regulatory functions and promotes drug discovery. Machine learning is becoming increasingly important in bioinformatics for applications such as analyzing protein-related data to achieve successful solutions. Mode...

Bibliographic Details
Main authors: Elbasani, Ermal, Njimbouom, Soualihou Ngnamsie, Oh, Tae-Jin, Kim, Eung-Hee, Lee, Hyun, Kim, Jeong-Dong
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8753816/
https://www.ncbi.nlm.nih.gov/pubmed/35016607
http://dx.doi.org/10.1186/s12859-022-04560-x
author Elbasani, Ermal
Njimbouom, Soualihou Ngnamsie
Oh, Tae-Jin
Kim, Eung-Hee
Lee, Hyun
Kim, Jeong-Dong
collection PubMed
description BACKGROUND: Compound–protein interaction prediction is necessary to investigate health regulatory functions and promotes drug discovery. Machine learning is becoming increasingly important in bioinformatics for applications such as analyzing protein-related data to achieve successful solutions. Modeling the properties and functions of proteins is important but challenging, especially when dealing with predictions of the sequence type. RESULT: We propose a method to model compounds and proteins for compound–protein interaction prediction. A graph neural network is used to represent the compounds, and a convolutional layer extended with a bidirectional recurrent neural network framework, Long Short-Term Memory, and Gated Recurrent Unit, is used for protein sequence vectorization. The convolutional layer captures regulatory protein functions, while the recurrent layer captures long-term dependencies between protein functions, thus improving the accuracy of interaction prediction with compounds. A database of 7000 sets of annotated compound–protein interactions, containing 1000 base length proteins, is taken into consideration for the implementation. The results indicate that the proposed model performs effectively and can yield satisfactory accuracy in compound–protein interaction prediction. CONCLUSION: The performance of GCRNN is based on classification according to a binary class of interactions between proteins and compounds. The architectural design of the GCRNN model integrates a Bi-Recurrent layer on top of the CNN to learn dependencies of motifs on protein sequences and improve the accuracy of the predictions.
format Online
Article
Text
id pubmed-8753816
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8753816 2022-01-12 GCRNN: graph convolutional recurrent neural network for compound–protein interaction prediction Elbasani, Ermal Njimbouom, Soualihou Ngnamsie Oh, Tae-Jin Kim, Eung-Hee Lee, Hyun Kim, Jeong-Dong BMC Bioinformatics Research
BioMed Central 2022-01-11 /pmc/articles/PMC8753816/ /pubmed/35016607 http://dx.doi.org/10.1186/s12859-022-04560-x Text en © The Author(s) 2022. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title GCRNN: graph convolutional recurrent neural network for compound–protein interaction prediction
topic Research