Sc2Mol: a scaffold-based two-step molecule generator with variational autoencoder and transformer

MOTIVATION: Finding molecules with desired pharmaceutical properties is crucial in drug discovery. Generative models can be an efficient tool to find desired molecules through the distribution learned by the model to approximate given training data. Existing generative models (i) do not consider bac...

Bibliographic Details
Main Authors: Liao, Zhirui; Xie, Lei; Mamitsuka, Hiroshi; Zhu, Shanfeng
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9835482/
https://www.ncbi.nlm.nih.gov/pubmed/36576008
http://dx.doi.org/10.1093/bioinformatics/btac814
author Liao, Zhirui
Xie, Lei
Mamitsuka, Hiroshi
Zhu, Shanfeng
collection PubMed
description MOTIVATION: Finding molecules with desired pharmaceutical properties is crucial in drug discovery. Generative models can be an efficient tool to find desired molecules through the distribution learned by the model to approximate given training data. Existing generative models (i) do not consider backbone structures (scaffolds), resulting in inefficiency or (ii) need prior patterns for scaffolds, causing bias. Scaffolds are reasonable to use, and it is imperative to design a generative model without any prior scaffold patterns. RESULTS: We propose a generative model-based molecule generator, Sc2Mol, without any prior scaffold patterns. Sc2Mol uses SMILES strings for molecules. It consists of two steps: scaffold generation and scaffold decoration, which are carried out by a variational autoencoder and a transformer, respectively. The two steps are powerful for implementing random molecule generation and scaffold optimization. Our empirical evaluation using drug-like molecule datasets confirmed the success of our model in distribution learning and molecule optimization. Also, our model could automatically learn the rules to transform coarse scaffolds into sophisticated drug candidates. These rules were consistent with those for current lead optimization. AVAILABILITY AND IMPLEMENTATION: The code is available at https://github.com/zhiruiliao/Sc2Mol. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-9835482
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9835482 2023-01-17 Bioinformatics Original Paper Oxford University Press 2022-12-28 /pmc/articles/PMC9835482/ /pubmed/36576008 http://dx.doi.org/10.1093/bioinformatics/btac814 Text en © The Author(s) 2022. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Sc2Mol: a scaffold-based two-step molecule generator with variational autoencoder and transformer
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9835482/
https://www.ncbi.nlm.nih.gov/pubmed/36576008
http://dx.doi.org/10.1093/bioinformatics/btac814