ELECTRA-DTA: a new compound-protein binding affinity prediction model based on the contextualized sequence encoding
MOTIVATION: Drug-target binding affinity (DTA) reflects the strength of the drug-target interaction; therefore, predicting the DTA can considerably benefit drug discovery by narrowing the search space and pruning drug-target (DT) pairs with low binding affinity scores. Representation learning using...
Main authors: | Wang, Junjie, Wen, NaiFeng, Wang, Chunyu, Zhao, Lingling, Cheng, Liang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer International Publishing 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8922401/ https://www.ncbi.nlm.nih.gov/pubmed/35292100 http://dx.doi.org/10.1186/s13321-022-00591-x |
_version_ | 1784669514613391360 |
---|---|
author | Wang, Junjie Wen, NaiFeng Wang, Chunyu Zhao, Lingling Cheng, Liang |
author_facet | Wang, Junjie Wen, NaiFeng Wang, Chunyu Zhao, Lingling Cheng, Liang |
author_sort | Wang, Junjie |
collection | PubMed |
description | MOTIVATION: Drug-target binding affinity (DTA) reflects the strength of the drug-target interaction; therefore, predicting the DTA can considerably benefit drug discovery by narrowing the search space and pruning drug-target (DT) pairs with low binding affinity scores. Representation learning using deep neural networks has achieved promising performance compared with traditional machine learning methods; hence, extensive research efforts have been made in learning the feature representation of proteins and compounds. However, such feature representation learning relies on a large-scale labelled dataset, which is not always available. RESULTS: We present an end-to-end deep learning framework, ELECTRA-DTA, to predict the binding affinity of drug-target pairs. This framework incorporates an unsupervised learning mechanism to train two ELECTRA-based contextual embedding models, one for protein amino acids and the other for compound SMILES string encoding. In addition, ELECTRA-DTA leverages a squeeze-and-excitation (SE) convolutional neural network block stacked over three fully connected layers to further capture the sequential and spatial features of the protein sequence and SMILES for the DTA regression task. Experimental evaluations show that ELECTRA-DTA outperforms various state-of-the-art DTA prediction models, especially with the challenging, interaction-sparse BindingDB dataset. In target selection and drug repurposing for COVID-19, ELECTRA-DTA also offers competitive performance, suggesting its potential in speeding drug discovery and generalizability for other compound- or protein-related computational tasks. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13321-022-00591-x. |
format | Online Article Text |
id | pubmed-8922401 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Springer International Publishing |
record_format | MEDLINE/PubMed |
spelling | pubmed-89224012022-03-15 ELECTRA-DTA: a new compound-protein binding affinity prediction model based on the contextualized sequence encoding Wang, Junjie Wen, NaiFeng Wang, Chunyu Zhao, Lingling Cheng, Liang J Cheminform Research Article MOTIVATION: Drug-target binding affinity (DTA) reflects the strength of the drug-target interaction; therefore, predicting the DTA can considerably benefit drug discovery by narrowing the search space and pruning drug-target (DT) pairs with low binding affinity scores. Representation learning using deep neural networks has achieved promising performance compared with traditional machine learning methods; hence, extensive research efforts have been made in learning the feature representation of proteins and compounds. However, such feature representation learning relies on a large-scale labelled dataset, which is not always available. RESULTS: We present an end-to-end deep learning framework, ELECTRA-DTA, to predict the binding affinity of drug-target pairs. This framework incorporates an unsupervised learning mechanism to train two ELECTRA-based contextual embedding models, one for protein amino acids and the other for compound SMILES string encoding. In addition, ELECTRA-DTA leverages a squeeze-and-excitation (SE) convolutional neural network block stacked over three fully connected layers to further capture the sequential and spatial features of the protein sequence and SMILES for the DTA regression task. Experimental evaluations show that ELECTRA-DTA outperforms various state-of-the-art DTA prediction models, especially with the challenging, interaction-sparse BindingDB dataset. In target selection and drug repurposing for COVID-19, ELECTRA-DTA also offers competitive performance, suggesting its potential in speeding drug discovery and generalizability for other compound- or protein-related computational tasks. 
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13321-022-00591-x. Springer International Publishing 2022-03-15 /pmc/articles/PMC8922401/ /pubmed/35292100 http://dx.doi.org/10.1186/s13321-022-00591-x Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Article Wang, Junjie Wen, NaiFeng Wang, Chunyu Zhao, Lingling Cheng, Liang ELECTRA-DTA: a new compound-protein binding affinity prediction model based on the contextualized sequence encoding |
title | ELECTRA-DTA: a new compound-protein binding affinity prediction model based on the contextualized sequence encoding |
title_full | ELECTRA-DTA: a new compound-protein binding affinity prediction model based on the contextualized sequence encoding |
title_fullStr | ELECTRA-DTA: a new compound-protein binding affinity prediction model based on the contextualized sequence encoding |
title_full_unstemmed | ELECTRA-DTA: a new compound-protein binding affinity prediction model based on the contextualized sequence encoding |
title_short | ELECTRA-DTA: a new compound-protein binding affinity prediction model based on the contextualized sequence encoding |
title_sort | electra-dta: a new compound-protein binding affinity prediction model based on the contextualized sequence encoding |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8922401/ https://www.ncbi.nlm.nih.gov/pubmed/35292100 http://dx.doi.org/10.1186/s13321-022-00591-x |
work_keys_str_mv | AT wangjunjie electradtaanewcompoundproteinbindingaffinitypredictionmodelbasedonthecontextualizedsequenceencoding AT wennaifeng electradtaanewcompoundproteinbindingaffinitypredictionmodelbasedonthecontextualizedsequenceencoding AT wangchunyu electradtaanewcompoundproteinbindingaffinitypredictionmodelbasedonthecontextualizedsequenceencoding AT zhaolingling electradtaanewcompoundproteinbindingaffinitypredictionmodelbasedonthecontextualizedsequenceencoding AT chengliang electradtaanewcompoundproteinbindingaffinitypredictionmodelbasedonthecontextualizedsequenceencoding |