
A merged molecular representation learning for molecular properties prediction with a web-based service

Deep learning has brought a dramatic development in molecular property prediction that is crucial in the field of drug discovery using various representations such as fingerprints, SMILES, and graphs. In particular, SMILES is used in various deep learning models via character-based approaches. Howev...


Bibliographic Details
Main authors: Kim, Hyunseob, Lee, Jeongcheol, Ahn, Sunil, Lee, Jongsuk Ruth
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8155205/
https://www.ncbi.nlm.nih.gov/pubmed/34040026
http://dx.doi.org/10.1038/s41598-021-90259-7
author Kim, Hyunseob
Lee, Jeongcheol
Ahn, Sunil
Lee, Jongsuk Ruth
collection PubMed
description Deep learning has brought a dramatic development in molecular property prediction that is crucial in the field of drug discovery using various representations such as fingerprints, SMILES, and graphs. In particular, SMILES is used in various deep learning models via character-based approaches. However, SMILES has a limitation in that it is hard to reflect chemical properties. In this paper, we propose a new self-supervised method to learn SMILES and chemical contexts of molecules simultaneously in pre-training the Transformer. The key of our model is learning structures with adjacency matrix embedding and learning logics that can infer descriptors via Quantitative Estimation of Drug-likeness prediction in pre-training. As a result, our method improves the generalization of the data and achieves the best average performance by benchmarking downstream tasks. Moreover, we develop a web-based fine-tuning service to utilize our model on various tasks.
format Online
Article
Text
id pubmed-8155205
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8155205 2021-05-28 A merged molecular representation learning for molecular properties prediction with a web-based service. Kim, Hyunseob; Lee, Jeongcheol; Ahn, Sunil; Lee, Jongsuk Ruth. Sci Rep. Article. Nature Publishing Group UK 2021-05-26 /pmc/articles/PMC8155205/ /pubmed/34040026 http://dx.doi.org/10.1038/s41598-021-90259-7 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title A merged molecular representation learning for molecular properties prediction with a web-based service
topic Article