Selected machine learning of HOMO–LUMO gaps with improved data-efficiency

Despite their relevance for organic electronics, quantum machine learning (QML) models of molecular electronic properties, such as HOMO–LUMO-gaps, often struggle to achieve satisfying data-efficiency as measured by decreasing prediction errors for increasing training set sizes. We demonstrate that p...

Bibliographic Details
Main authors:	Mazouin, Bernard; Schöpfer, Alexandre Alain; von Lilienfeld, O. Anatole
Format:	Online Article Text
Language:	English
Published:	RSC 2022
Subjects:	Chemistry
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9662596/ https://www.ncbi.nlm.nih.gov/pubmed/36561279 http://dx.doi.org/10.1039/d2ma00742h

_version_	1784830718436704256
author	Mazouin, Bernard Schöpfer, Alexandre Alain von Lilienfeld, O. Anatole
author_facet	Mazouin, Bernard Schöpfer, Alexandre Alain von Lilienfeld, O. Anatole
author_sort	Mazouin, Bernard
collection	PubMed
description	Despite their relevance for organic electronics, quantum machine learning (QML) models of molecular electronic properties, such as HOMO–LUMO-gaps, often struggle to achieve satisfying data-efficiency as measured by decreasing prediction errors for increasing training set sizes. We demonstrate that partitioning training sets into different chemical classes prior to training results in independently trained QML models with overall reduced training data needs. For organic molecules drawn from previously published QM7 and QM9-data-sets we have identified and exploited three relevant classes corresponding to compounds containing either aromatic rings and carbonyl groups, or single unsaturated bonds, or saturated bonds. The selected QML models of band-gaps (considered at GW and hybrid DFT levels of theory) reach mean absolute prediction errors of ∼0.1 eV for up to an order of magnitude fewer training molecules than for QML models trained on randomly selected molecules. Comparison to Δ-QML models of band-gaps indicates that selected QML exhibit superior data-efficiency. Our findings suggest that selected QML, e.g. based on simple classifications prior to training, could help to successfully tackle challenging quantum property screening tasks of large libraries with high fidelity and low computational burden.
format	Online Article Text
id	pubmed-9662596
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	RSC
record_format	MEDLINE/PubMed
spelling	pubmed-96625962022-12-20 Selected machine learning of HOMO–LUMO gaps with improved data-efficiency Mazouin, Bernard Schöpfer, Alexandre Alain von Lilienfeld, O. Anatole Mater Adv Chemistry Despite their relevance for organic electronics, quantum machine learning (QML) models of molecular electronic properties, such as HOMO–LUMO-gaps, often struggle to achieve satisfying data-efficiency as measured by decreasing prediction errors for increasing training set sizes. We demonstrate that partitioning training sets into different chemical classes prior to training results in independently trained QML models with overall reduced training data needs. For organic molecules drawn from previously published QM7 and QM9-data-sets we have identified and exploited three relevant classes corresponding to compounds containing either aromatic rings and carbonyl groups, or single unsaturated bonds, or saturated bonds. The selected QML models of band-gaps (considered at GW and hybrid DFT levels of theory) reach mean absolute prediction errors of ∼0.1 eV for up to an order of magnitude fewer training molecules than for QML models trained on randomly selected molecules. Comparison to Δ-QML models of band-gaps indicates that selected QML exhibit superior data-efficiency. Our findings suggest that selected QML, e.g. based on simple classifications prior to training, could help to successfully tackle challenging quantum property screening tasks of large libraries with high fidelity and low computational burden. RSC 2022-09-20 /pmc/articles/PMC9662596/ /pubmed/36561279 http://dx.doi.org/10.1039/d2ma00742h Text en This journal is © The Royal Society of Chemistry https://creativecommons.org/licenses/by/3.0/
spellingShingle	Chemistry Mazouin, Bernard Schöpfer, Alexandre Alain von Lilienfeld, O. Anatole Selected machine learning of HOMO–LUMO gaps with improved data-efficiency
title	Selected machine learning of HOMO–LUMO gaps with improved data-efficiency
title_full	Selected machine learning of HOMO–LUMO gaps with improved data-efficiency
title_fullStr	Selected machine learning of HOMO–LUMO gaps with improved data-efficiency
title_full_unstemmed	Selected machine learning of HOMO–LUMO gaps with improved data-efficiency
title_short	Selected machine learning of HOMO–LUMO gaps with improved data-efficiency
title_sort	selected machine learning of homo–lumo gaps with improved data-efficiency
topic	Chemistry
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9662596/ https://www.ncbi.nlm.nih.gov/pubmed/36561279 http://dx.doi.org/10.1039/d2ma00742h
work_keys_str_mv	AT mazouinbernard selectedmachinelearningofhomolumogapswithimproveddataefficiency AT schopferalexandrealain selectedmachinelearningofhomolumogapswithimproveddataefficiency AT vonlilienfeldoanatole selectedmachinelearningofhomolumogapswithimproveddataefficiency
