
MolE8: finding DFT potential energy surface minima values from force-field optimised organic molecules with new machine learning representations

The use of machine learning techniques in computational chemistry has gained significant momentum since large molecular databases are now readily available. Predictions of molecular properties using machine learning have advantages over the traditional quantum mechanics calculations because they can...


Bibliographic Details
Main Authors: Lee, Sanha; Ermanis, Kristaps; Goodman, Jonathan M.
Format: Online Article Text
Language: English
Published: The Royal Society of Chemistry 2022
Subjects: Chemistry
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9214916/
https://www.ncbi.nlm.nih.gov/pubmed/35799803
http://dx.doi.org/10.1039/d1sc06324c
author Lee, Sanha
Ermanis, Kristaps
Goodman, Jonathan M.
collection PubMed
description The use of machine learning techniques in computational chemistry has gained significant momentum since large molecular databases are now readily available. Predictions of molecular properties using machine learning have advantages over traditional quantum mechanics calculations because they can be computationally cheaper without losing accuracy. We present a new extrapolatable and explainable molecular representation based on bonds, angles and dihedrals that can be used to train machine learning models. The trained models can accurately predict the electronic energy and the free energy of small organic molecules with atom types C, H, N and O, with a mean absolute error of 1.2 kcal mol⁻¹. The models can be extrapolated to larger organic molecules with an average error of less than 3.7 kcal mol⁻¹ for 10 or fewer heavy atoms, which represent a chemical space two orders of magnitude larger. The rapid energy predictions of multiple molecules, up to 7 times faster than previous ML models of similar accuracy, have been achieved by sampling geometries around the potential energy surface minima. Therefore, the input geometries do not have to be located precisely on the minima, and we show that accurate density functional theory energy predictions can be made from force-field optimised geometries with a mean absolute error of 2.5 kcal mol⁻¹.
format Online
Article
Text
id pubmed-9214916
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher The Royal Society of Chemistry
record_format MEDLINE/PubMed
spelling pubmed-9214916 2022-07-06 Chem Sci Chemistry The Royal Society of Chemistry 2022-05-28 /pmc/articles/PMC9214916/ /pubmed/35799803 http://dx.doi.org/10.1039/d1sc06324c Text en This journal is © The Royal Society of Chemistry https://creativecommons.org/licenses/by/3.0/
title MolE8: finding DFT potential energy surface minima values from force-field optimised organic molecules with new machine learning representations
topic Chemistry
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9214916/
https://www.ncbi.nlm.nih.gov/pubmed/35799803
http://dx.doi.org/10.1039/d1sc06324c