
Predicting band gaps of MOFs on small data by deep transfer learning with data augmentation strategies

Porphyrin-based MOFs combine the unique photophysical and electrochemical properties of metalloporphyrins with the catalytic efficiency of MOF materials, making them an important candidate for light energy harvesting and conversion. However, accurate prediction of the band gap of porphyrin-based MOF...


Bibliographic Details
Main authors: Zhang, Zhihui; Zhang, Chengwei; Zhang, Yutao; Deng, Shengwei; Yang, Yun-Fang; Su, An; She, Yuan-Bin
Format: Online Article Text
Language: English
Published: The Royal Society of Chemistry 2023
Subjects: Chemistry
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10243186/
https://www.ncbi.nlm.nih.gov/pubmed/37288371
http://dx.doi.org/10.1039/d3ra02142d
author Zhang, Zhihui
Zhang, Chengwei
Zhang, Yutao
Deng, Shengwei
Yang, Yun-Fang
Su, An
She, Yuan-Bin
collection PubMed
description Porphyrin-based MOFs combine the unique photophysical and electrochemical properties of metalloporphyrins with the catalytic efficiency of MOF materials, making them an important candidate for light energy harvesting and conversion. However, accurate prediction of the band gap of porphyrin-based MOFs is hampered by their complex structure–function relationships. Although machine learning (ML) has performed well in predicting the properties of MOFs with large training datasets, such ML applications become challenging when the training data size of the materials is small. In this study, we first constructed a dataset of 202 porphyrin-based MOFs using DFT computations and increased the training data size using two data augmentation strategies. After that, four state-of-the-art neural network models were pre-trained with the recognized open-source database QMOF and fine-tuned with our augmented self-curated datasets. The GCN models predicted the band gaps of the porphyrin-based materials with the lowest RMSE of 0.2767 eV and MAE of 0.1463 eV. In addition, the data augmentation strategy rotation and mirroring effectively decreased the RMSE by 38.51% and MAE by 50.05%. This study demonstrates that, when proper transfer learning and data augmentation strategies are applied, machine learning models can predict the properties of MOFs using small training data.
format Online
Article
Text
id pubmed-10243186
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher The Royal Society of Chemistry
record_format MEDLINE/PubMed
spelling pubmed-10243186 2023-06-07 RSC Adv Chemistry The Royal Society of Chemistry 2023-06-06 /pmc/articles/PMC10243186/ /pubmed/37288371 http://dx.doi.org/10.1039/d3ra02142d Text en This journal is © The Royal Society of Chemistry https://creativecommons.org/licenses/by-nc/3.0/
title Predicting band gaps of MOFs on small data by deep transfer learning with data augmentation strategies
topic Chemistry