Generative Models for Extrapolation Prediction in Materials Informatics

We report a deep generative model for regression tasks in materials informatics. The model is introduced as a component of a data imputer and predicts more than 20 diverse experimental properties of organic molecules. The imputer is designed to predict material properties by “imagi...

Bibliographic Details
Main Authors: Hatakeyama-Sato, Kan, Oyaizu, Kenichi
Format: Online Article Text
Language: English
Published: American Chemical Society 2021
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8190893/
https://www.ncbi.nlm.nih.gov/pubmed/34124480
http://dx.doi.org/10.1021/acsomega.1c01716
author Hatakeyama-Sato, Kan
Oyaizu, Kenichi
collection PubMed
description We report a deep generative model for regression tasks in materials informatics. The model is introduced as a component of a data imputer and predicts more than 20 diverse experimental properties of organic molecules. The imputer is designed to predict material properties by “imagining” the missing data in the database, enabling the use of incomplete material data. Even removing 60% of the data does not diminish the prediction accuracy in a model task. Moreover, the model excels at extrapolation prediction, where target values of the test data are out of the range of the training data. Such an extrapolation has been regarded as an essential technique for exploring novel materials but has hardly been studied to date due to its difficulty. We demonstrate that the prediction performance can be improved by >30% by using the imputer compared with traditional linear regression and boosting models. The benefit becomes especially pronounced with few records for an experimental property (<100 cases) when prediction would be difficult by conventional methods. The presented approach can be used to more efficiently explore functional materials and break through previous performance limits.
format Online
Article
Text
id pubmed-8190893
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-8190893 2021-06-11 Generative Models for Extrapolation Prediction in Materials Informatics Hatakeyama-Sato, Kan Oyaizu, Kenichi ACS Omega American Chemical Society 2021-05-25 /pmc/articles/PMC8190893/ /pubmed/34124480 http://dx.doi.org/10.1021/acsomega.1c01716 Text en © 2021 The Authors. Published by American Chemical Society. Permits non-commercial access and re-use, provided that author attribution and integrity are maintained; but does not permit creation of adaptations or other derivative works (https://creativecommons.org/licenses/by-nc-nd/4.0/).
title Generative Models for Extrapolation Prediction in Materials Informatics