
Machine learning models for hydrogen bond donor and acceptor strengths using large and diverse training data generated by first-principles interaction free energies

We present machine learning (ML) models for hydrogen bond acceptor (HBA) and hydrogen bond donor (HBD) strengths. Quantum chemical (QC) free energies in solution for 1:1 hydrogen-bonded complex formation to the reference molecules 4-fluorophenol and acetone serve as our target values. Our acceptor a...


Bibliographic Details
Main authors: Bauer, Christoph A., Schneider, Gisbert, Göller, Andreas H.
Format: Online Article Text
Language: English
Published: Springer International Publishing 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6737620/
https://www.ncbi.nlm.nih.gov/pubmed/33430967
http://dx.doi.org/10.1186/s13321-019-0381-4
collection PubMed
description We present machine learning (ML) models for hydrogen bond acceptor (HBA) and hydrogen bond donor (HBD) strengths. Quantum chemical (QC) free energies in solution for 1:1 hydrogen-bonded complex formation to the reference molecules 4-fluorophenol and acetone serve as our target values. Our acceptor and donor databases are the largest on record with 4426 and 1036 data points, respectively. After scanning over radial atomic descriptors and ML methods, our final trained HBA and HBD ML models achieve RMSEs of 3.8 kJ mol⁻¹ (acceptors) and 2.3 kJ mol⁻¹ (donors) on experimental test sets. This performance is comparable with previous models that are trained on experimental hydrogen bonding free energies, indicating that molecular QC data can serve as a substitute for experiment. This could ultimately lead to a full replacement of wet-lab chemistry by QC for HBA/HBD strength determination. As a possible chemical application of our ML models, we highlight our predicted HBA and HBD strengths as possible descriptors in two case studies on trends in intramolecular hydrogen bonding. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13321-019-0381-4) contains supplementary material, which is available to authorized users.
id pubmed-6737620
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-6737620 2019-09-16 J Cheminform Research Article
Springer International Publishing 2019-09-11 /pmc/articles/PMC6737620/ /pubmed/33430967 http://dx.doi.org/10.1186/s13321-019-0381-4 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
topic Research Article