Enabling deeper learning on big data for materials informatics applications

Bibliographic Details
Main authors: Jha, Dipendra; Gupta, Vishu; Ward, Logan; Yang, Zijiang; Wolverton, Christopher; Foster, Ian; Liao, Wei-keng; Choudhary, Alok; Agrawal, Ankit
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7895970/
https://www.ncbi.nlm.nih.gov/pubmed/33608599
http://dx.doi.org/10.1038/s41598-021-83193-1
author Jha, Dipendra
Gupta, Vishu
Ward, Logan
Yang, Zijiang
Wolverton, Christopher
Foster, Ian
Liao, Wei-keng
Choudhary, Alok
Agrawal, Ankit
collection PubMed
description The application of machine learning (ML) techniques in materials science has attracted significant attention in recent years, due to their impressive ability to efficiently extract data-driven linkages from various input materials representations to their output properties. While the application of traditional ML techniques has become quite ubiquitous, there have been limited applications of more advanced deep learning (DL) techniques, primarily because big materials datasets are relatively rare. Given the demonstrated potential and advantages of DL and the increasing availability of big materials datasets, it is attractive to go for deeper neural networks in a bid to boost model performance, but in reality, it leads to performance degradation due to the vanishing gradient problem. In this paper, we address the question of how to enable deeper learning for cases where big materials data is available. Here, we present a general deep learning framework based on Individual Residual learning (IRNet) composed of very deep neural networks that can work with any vector-based materials representation as input to build accurate property prediction models. We find that the proposed IRNet models can not only successfully alleviate the vanishing gradient problem and enable deeper learning, but also lead to significantly (up to 47%) better model accuracy as compared to plain deep neural networks and traditional ML techniques for a given input materials representation in the presence of big data.
format Online
Article
Text
id pubmed-7895970
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-7895970 2021-02-24 Enabling deeper learning on big data for materials informatics applications Jha, Dipendra Gupta, Vishu Ward, Logan Yang, Zijiang Wolverton, Christopher Foster, Ian Liao, Wei-keng Choudhary, Alok Agrawal, Ankit Sci Rep Article The application of machine learning (ML) techniques in materials science has attracted significant attention in recent years, due to their impressive ability to efficiently extract data-driven linkages from various input materials representations to their output properties. While the application of traditional ML techniques has become quite ubiquitous, there have been limited applications of more advanced deep learning (DL) techniques, primarily because big materials datasets are relatively rare. Given the demonstrated potential and advantages of DL and the increasing availability of big materials datasets, it is attractive to go for deeper neural networks in a bid to boost model performance, but in reality, it leads to performance degradation due to the vanishing gradient problem. In this paper, we address the question of how to enable deeper learning for cases where big materials data is available. Here, we present a general deep learning framework based on Individual Residual learning (IRNet) composed of very deep neural networks that can work with any vector-based materials representation as input to build accurate property prediction models. We find that the proposed IRNet models can not only successfully alleviate the vanishing gradient problem and enable deeper learning, but also lead to significantly (up to 47%) better model accuracy as compared to plain deep neural networks and traditional ML techniques for a given input materials representation in the presence of big data.
Nature Publishing Group UK 2021-02-19 /pmc/articles/PMC7895970/ /pubmed/33608599 http://dx.doi.org/10.1038/s41598-021-83193-1 Text en © The Author(s) 2021. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Enabling deeper learning on big data for materials informatics applications
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7895970/
https://www.ncbi.nlm.nih.gov/pubmed/33608599
http://dx.doi.org/10.1038/s41598-021-83193-1