Enhancing the Search Performance of Bayesian Optimization by Creating Different Descriptor Datasets Using Density Functional Theory

Descriptors calculated from molecular structure information can be used as explanatory variables in Bayesian optimization (BO). Even though structural and descriptor information can be obtained from various databases for general compounds, information on highly confidential compoun...

Bibliographic Details
Main Authors: Morishita, Toshiharu; Kaneko, Hiromasa
Format: Online Article Text
Language: English
Published: American Chemical Society 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10500684/
https://www.ncbi.nlm.nih.gov/pubmed/37720759
http://dx.doi.org/10.1021/acsomega.3c04891
author Morishita, Toshiharu
Kaneko, Hiromasa
collection PubMed
description Descriptors calculated from molecular structure information can be used as explanatory variables in Bayesian optimization (BO). Even though structural and descriptor information can be obtained from various databases for general compounds, information on highly confidential compounds such as pharmaceutical intermediates and active pharmaceutical ingredients cannot be retrieved from these databases. In particular, determining the stable structure and electronic state of a compound via quantum chemical calculations from descriptor information requires considerable computational time. Although descriptor information can be obtained using density functional theory (DFT), which has a relatively light computational load, only conventional combinations of basis sets and functionals can be selected before experiments instead of the best ones. Few studies have discussed these effects on the search performance of BO, and good search performance is highly dependent on the application. Therefore, we developed a method to improve the search performance of BO by using descriptors computed from several combinations of basis sets and functionals. The dataset obtained from averaging multiple descriptor sets exhibited better BO search performance than that of a single descriptor dataset. In addition, the more descriptor sets used for averaging, the better the search performance. This method has a relatively small computational load and can be easily used by those who are unfamiliar with quantum chemical calculations.
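The averaging step mentioned in the description can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `average_descriptor_sets`, the dictionary layout, the compound names, and the numeric values are all hypothetical, and the sketch assumes a plain element-wise mean over descriptor tables that share the same compounds and descriptor order. Whether the published method scales or weights the descriptor sets before averaging is not stated in this record.

```python
from statistics import fmean


def average_descriptor_sets(descriptor_sets):
    """Return the element-wise mean of several descriptor tables.

    Each table maps a compound name to a list of descriptor values.
    All tables must cover the same compounds, with descriptors in the
    same order (an assumption of this sketch).
    """
    first = descriptor_sets[0]
    for table in descriptor_sets[1:]:
        if table.keys() != first.keys():
            raise ValueError("all descriptor sets must cover the same compounds")

    averaged = {}
    for compound in first:
        rows = [table[compound] for table in descriptor_sets]
        if len({len(row) for row in rows}) != 1:
            raise ValueError(f"descriptor count differs for {compound}")
        # zip(*rows) groups the same descriptor across all descriptor sets.
        averaged[compound] = [fmean(column) for column in zip(*rows)]
    return averaged


# Hypothetical descriptor values from two functional/basis-set combinations.
set_a = {"compound_1": [-6.0, -1.0], "compound_2": [-5.5, -0.5]}
set_b = {"compound_1": [-6.5, -1.5], "compound_2": [-5.0, -1.0]}

averaged = average_descriptor_sets([set_a, set_b])
print(averaged)
```

In a workflow like the one the description outlines, the averaged table would then serve as the explanatory variables for the BO surrogate model in place of any single descriptor set.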
format Online
Article
Text
id pubmed-10500684
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-10500684 2023-09-15 Enhancing the Search Performance of Bayesian Optimization by Creating Different Descriptor Datasets Using Density Functional Theory. Morishita, Toshiharu; Kaneko, Hiromasa. ACS Omega. American Chemical Society 2023-08-28 /pmc/articles/PMC10500684/ /pubmed/37720759 http://dx.doi.org/10.1021/acsomega.3c04891 Text en © 2023 The Authors. Published by American Chemical Society.
License: https://creativecommons.org/licenses/by-nc-nd/4.0/ Permits non-commercial access and re-use, provided that author attribution and integrity are maintained, but does not permit creation of adaptations or other derivative works.
title Enhancing the Search Performance of Bayesian Optimization by Creating Different Descriptor Datasets Using Density Functional Theory