
Novel Solubility Prediction Models: Molecular Fingerprints and Physicochemical Features vs Graph Convolutional Neural Networks


Bibliographic Details
Main authors: Lee, Sumin; Lee, Myeonghun; Gyak, Ki-Won; Kim, Sung Dug; Kim, Mi-Jeong; Min, Kyoungmin
Format: Online Article Text
Language: English
Published: American Chemical Society 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9016862/
https://www.ncbi.nlm.nih.gov/pubmed/35449985
http://dx.doi.org/10.1021/acsomega.2c00697
author Lee, Sumin
Lee, Myeonghun
Gyak, Ki-Won
Kim, Sung Dug
Kim, Mi-Jeong
Min, Kyoungmin
collection PubMed
description Predicting solubility values that are both accurate and reliable has long been a crucial but challenging task. In this work, surrogate-model-based methods were developed to accurately predict the solubility of a solute-solvent pair of molecules through machine learning and deep learning. The current study employed two methods: (1) converting molecules into molecular fingerprints and adding optimal physicochemical properties as descriptors and (2) using graph convolutional network (GCN) models, which represent molecules as graphs, to perform the prediction tasks. Two prediction tasks were then conducted with each method: (1) the solubility value (regression) and (2) the solubility class (classification). The fingerprint-based method demonstrates that high performance is achievable by adding simple but significant physicochemical descriptors to molecular fingerprints, while the GCN method shows that various properties of chemical compounds can be predicted from relatively simplified features of the graph representation. The developed methodologies provide a comprehensive understanding of how to construct a proper model for predicting solubility and can be employed to find suitable solutes and solvents.
format Online
Article
Text
id pubmed-9016862
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-9016862 2022-04-20 ACS Omega American Chemical Society 2022-04-04 /pmc/articles/PMC9016862/ /pubmed/35449985 http://dx.doi.org/10.1021/acsomega.2c00697 Text en © 2022 The Authors. Published by American Chemical Society. https://creativecommons.org/licenses/by-nc-nd/4.0/ Permits non-commercial access and re-use, provided that author attribution and integrity are maintained, but does not permit creation of adaptations or other derivative works.
title Novel Solubility Prediction Models: Molecular Fingerprints and Physicochemical Features vs Graph Convolutional Neural Networks
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9016862/
https://www.ncbi.nlm.nih.gov/pubmed/35449985
http://dx.doi.org/10.1021/acsomega.2c00697