Data-efficient machine learning for molecular crystal structure prediction

The combination of modern machine learning (ML) approaches with high-quality data from quantum mechanical (QM) calculations can yield models with an unrivalled accuracy/cost ratio. However, such methods are ultimately limited by the computational effort required to produce the reference data. In par...

Bibliographic Details
Main authors: Wengert, Simon; Csányi, Gábor; Reuter, Karsten; Margraf, Johannes T.
Format: Online Article Text
Language: English
Published: The Royal Society of Chemistry 2021
Subjects: Chemistry
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8179468/
https://www.ncbi.nlm.nih.gov/pubmed/34163719
http://dx.doi.org/10.1039/d0sc05765g
author Wengert, Simon
Csányi, Gábor
Reuter, Karsten
Margraf, Johannes T.
collection PubMed
description The combination of modern machine learning (ML) approaches with high-quality data from quantum mechanical (QM) calculations can yield models with an unrivalled accuracy/cost ratio. However, such methods are ultimately limited by the computational effort required to produce the reference data. In particular, reference calculations for periodic systems with many atoms can become prohibitively expensive for higher levels of theory. This trade-off is critical in the context of organic crystal structure prediction (CSP). Here, a data-efficient ML approach would be highly desirable, since screening a huge space of possible polymorphs in a narrow energy range requires the assessment of a large number of trial structures with high accuracy. In this contribution, we present tailored Δ-ML models that allow screening a wide range of crystal candidates while adequately describing the subtle interplay between intermolecular interactions such as H-bonding and many-body dispersion effects. This is achieved by enhancing a physics-based description of long-range interactions at the density functional tight binding (DFTB) level—for which an efficient implementation is available—with a short-range ML model trained on high-quality first-principles reference data. The presented workflow is broadly applicable to different molecular materials, without the need for a single periodic calculation at the reference level of theory. We show that this even allows the use of wavefunction methods in CSP.
format Online
Article
Text
id pubmed-8179468
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher The Royal Society of Chemistry
record_format MEDLINE/PubMed
spelling pubmed-8179468 2021-06-22 Data-efficient machine learning for molecular crystal structure prediction Wengert, Simon; Csányi, Gábor; Reuter, Karsten; Margraf, Johannes T. Chem Sci Chemistry The Royal Society of Chemistry 2021-02-11 /pmc/articles/PMC8179468/ /pubmed/34163719 http://dx.doi.org/10.1039/d0sc05765g Text en This journal is © The Royal Society of Chemistry https://creativecommons.org/licenses/by-nc/3.0/
title Data-efficient machine learning for molecular crystal structure prediction
topic Chemistry
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8179468/
https://www.ncbi.nlm.nih.gov/pubmed/34163719
http://dx.doi.org/10.1039/d0sc05765g