Accurate, interpretable predictions of materials properties within transformer language models

Bibliographic Details
Main authors: Korolev, Vadim; Protsenko, Pavel
Format: Online Article Text
Language: English
Published: Elsevier 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10591138/
https://www.ncbi.nlm.nih.gov/pubmed/37876904
http://dx.doi.org/10.1016/j.patter.2023.100803
author Korolev, Vadim
Protsenko, Pavel
collection PubMed
description Property prediction accuracy has long been a key parameter of machine learning in materials informatics. Accordingly, advanced models showing state-of-the-art performance turn into highly parameterized black boxes missing interpretability. Here, we present an elegant way to make their reasoning transparent. Human-readable text-based descriptions automatically generated within a suite of open-source tools are proposed as materials representation. Transformer language models pretrained on 2 million peer-reviewed articles take as input well-known terms such as chemical composition, crystal symmetry, and site geometry. Our approach outperforms crystal graph networks by classifying four out of five analyzed properties if one considers all available reference data. Moreover, fine-tuned text-based models show high accuracy in the ultra-small data limit. Explanations of their internal machinery are produced using local interpretability techniques and are faithful and consistent with domain expert rationales. This language-centric framework makes accurate property predictions accessible to people without artificial-intelligence expertise.
format Online
Article
Text
id pubmed-10591138
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-10591138 2023-10-24 Patterns (N Y) Article Elsevier 2023-08-02 /pmc/articles/PMC10591138/ /pubmed/37876904 http://dx.doi.org/10.1016/j.patter.2023.100803 Text en © 2023 The Authors https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Accurate, interpretable predictions of materials properties within transformer language models
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10591138/
https://www.ncbi.nlm.nih.gov/pubmed/37876904
http://dx.doi.org/10.1016/j.patter.2023.100803