Variational autoencoder-based chemical latent space for large molecular structures with 3D complexity

Bibliographic Details
Main Authors: Ochiai, Toshiki; Inukai, Tensei; Akiyama, Manato; Furui, Kairi; Ohue, Masahito; Matsumori, Nobuaki; Inuki, Shinsuke; Uesugi, Motonari; Sunazuka, Toshiaki; Kikuchi, Kazuya; Kakeya, Hideaki; Sakakibara, Yasubumi
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10654724/
https://www.ncbi.nlm.nih.gov/pubmed/37973971
http://dx.doi.org/10.1038/s42004-023-01054-6
Description
Summary: The structural diversity of chemical libraries, which are systematic collections of compounds that have the potential to bind to biomolecules, can be represented by a chemical latent space. A chemical latent space is a projection of compound structures into a mathematical space based on several molecular features; it can express the structural diversity within a compound library, making it possible to explore a broader chemical space and generate novel compound structures as drug candidates. In this study, we developed a deep-learning method called NP-VAE (Natural Product-oriented Variational Autoencoder), based on a variational autoencoder, for handling hard-to-analyze datasets from DrugBank and large molecular structures such as natural compounds with chirality, an essential factor in the 3D complexity of compounds. NP-VAE succeeded in constructing a chemical latent space from large compounds that existing methods could not handle, achieved higher reconstruction accuracy, and demonstrated stable performance as a generative model across various indices. Furthermore, by exploring the acquired latent space, we succeeded in comprehensively analyzing a compound library containing natural compounds and in generating novel compound structures with optimized functions.