Siamese Recurrent Neural Network with a Self-Attention Mechanism for Bioactivity Prediction

Activity prediction plays an essential role in drug discovery by directing the search for drug candidates in the relevant chemical space. Despite being applied successfully to image recognition and semantic similarity, the Siamese neural network has rarely been explored in drug discover...

Bibliographic Details
Main authors: Fernández-Llaneza, Daniel, Ulander, Silas, Gogishvili, Dea, Nittinger, Eva, Zhao, Hongtao, Tyrchan, Christian
Format: Online Article Text
Language: English
Published: American Chemical Society 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8153912/
https://www.ncbi.nlm.nih.gov/pubmed/34056263
http://dx.doi.org/10.1021/acsomega.1c01266
author Fernández-Llaneza, Daniel
Ulander, Silas
Gogishvili, Dea
Nittinger, Eva
Zhao, Hongtao
Tyrchan, Christian
author_sort Fernández-Llaneza, Daniel
collection PubMed
description Activity prediction plays an essential role in drug discovery by directing the search for drug candidates in the relevant chemical space. Despite being applied successfully to image recognition and semantic similarity, the Siamese neural network has rarely been explored in drug discovery, where modelling faces challenges such as insufficient data and class imbalance. Here, we present a Siamese recurrent neural network model (SiameseCHEM) based on a bidirectional long short-term memory architecture with a self-attention mechanism, which can automatically learn discriminative features from the SMILES representations of small molecules. Subsequently, it is used to categorize the bioactivity of small molecules via N-shot learning. Trained on random SMILES strings, it proves robust across five different datasets for the task of binary or categorical classification of bioactivity. Benchmarked against two baseline machine learning models that use the chemistry-rich ECFP fingerprints as input, the deep learning model outperforms them on three datasets and achieves comparable performance on the other two. The failure of both baseline methods on SMILES strings suggests that the deep learning model may learn task-specific chemistry features encoded in SMILES strings.
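The N-shot classification step mentioned in the description can be illustrated with a minimal sketch. This is not the SiameseCHEM model: the `embed` function below is a hypothetical stand-in (character-bigram counts) for the learned BiLSTM encoder with self-attention, cosine similarity is an assumed scoring function, and the labels in `SUPPORT` are arbitrary toy labels rather than real bioactivity data. It only shows the general idea of assigning a query molecule to the class whose N labeled examples it resembles most.

```python
import math
from collections import Counter


def embed(smiles: str) -> Counter:
    """Hypothetical stand-in for a learned encoder: character-bigram counts."""
    return Counter(smiles[i:i + 2] for i in range(len(smiles) - 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def n_shot_classify(query: str, support: dict) -> str:
    """Return the label whose support examples are, on average, most similar to the query."""
    q = embed(query)
    scores = {
        label: sum(cosine(q, embed(s)) for s in examples) / len(examples)
        for label, examples in support.items()
    }
    return max(scores, key=scores.get)


# Toy 2-shot support set; the class labels are arbitrary and purely illustrative.
SUPPORT = {
    "class_a": ["CCO", "CCCO"],
    "class_b": ["c1ccccc1", "c1ccccc1C"],
}

if __name__ == "__main__":
    print(n_shot_classify("CCCCO", SUPPORT))
    print(n_shot_classify("c1ccccc1O", SUPPORT))
```

In a Siamese setup, the same encoder is applied to both the query and each support example, so swapping `embed` for a trained network would leave the classification logic unchanged.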
format Online
Article
Text
id pubmed-8153912
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-8153912 2021-05-27. ACS Omega. American Chemical Society, 2021-04-15. Text en. © 2021 The Authors. Published by American Chemical Society. Permits non-commercial access and re-use, provided that author attribution and integrity are maintained; but does not permit creation of adaptations or other derivative works (https://creativecommons.org/licenses/by-nc-nd/4.0/).
title Siamese Recurrent Neural Network with a Self-Attention Mechanism for Bioactivity Prediction