Predicting band gaps of MOFs on small data by deep transfer learning with data augmentation strategies
Porphyrin-based MOFs combine the unique photophysical and electrochemical properties of metalloporphyrins with the catalytic efficiency of MOF materials, making them an important candidate for light energy harvesting and conversion. However, accurate prediction of the band gap of porphyrin-based MOF...
Main authors: | Zhang, Zhihui; Zhang, Chengwei; Zhang, Yutao; Deng, Shengwei; Yang, Yun-Fang; Su, An; She, Yuan-Bin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | The Royal Society of Chemistry 2023 |
Subjects: | Chemistry |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10243186/ https://www.ncbi.nlm.nih.gov/pubmed/37288371 http://dx.doi.org/10.1039/d3ra02142d |
_version_ | 1785054376573796352 |
---|---|
author | Zhang, Zhihui; Zhang, Chengwei; Zhang, Yutao; Deng, Shengwei; Yang, Yun-Fang; Su, An; She, Yuan-Bin |
author_facet | Zhang, Zhihui; Zhang, Chengwei; Zhang, Yutao; Deng, Shengwei; Yang, Yun-Fang; Su, An; She, Yuan-Bin |
author_sort | Zhang, Zhihui |
collection | PubMed |
description | Porphyrin-based MOFs combine the unique photophysical and electrochemical properties of metalloporphyrins with the catalytic efficiency of MOF materials, making them an important candidate for light energy harvesting and conversion. However, accurate prediction of the band gap of porphyrin-based MOFs is hampered by their complex structure–function relationships. Although machine learning (ML) has performed well in predicting the properties of MOFs with large training datasets, such ML applications become challenging when the training data size of the materials is small. In this study, we first constructed a dataset of 202 porphyrin-based MOFs using DFT computations and increased the training data size using two data augmentation strategies. After that, four state-of-the-art neural network models were pre-trained with the recognized open-source database QMOF and fine-tuned with our augmented self-curated datasets. The GCN models predicted the band gaps of the porphyrin-based materials with the lowest RMSE of 0.2767 eV and MAE of 0.1463 eV. In addition, the data augmentation strategy rotation and mirroring effectively decreased the RMSE by 38.51% and MAE by 50.05%. This study demonstrates that, when proper transfer learning and data augmentation strategies are applied, machine learning models can predict the properties of MOFs using small training data. |
format | Online Article Text |
id | pubmed-10243186 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | The Royal Society of Chemistry |
record_format | MEDLINE/PubMed |
spelling | pubmed-102431862023-06-07 Predicting band gaps of MOFs on small data by deep transfer learning with data augmentation strategies Zhang, Zhihui Zhang, Chengwei Zhang, Yutao Deng, Shengwei Yang, Yun-Fang Su, An She, Yuan-Bin RSC Adv Chemistry Porphyrin-based MOFs combine the unique photophysical and electrochemical properties of metalloporphyrins with the catalytic efficiency of MOF materials, making them an important candidate for light energy harvesting and conversion. However, accurate prediction of the band gap of porphyrin-based MOFs is hampered by their complex structure–function relationships. Although machine learning (ML) has performed well in predicting the properties of MOFs with large training datasets, such ML applications become challenging when the training data size of the materials is small. In this study, we first constructed a dataset of 202 porphyrin-based MOFs using DFT computations and increased the training data size using two data augmentation strategies. After that, four state-of-the-art neural network models were pre-trained with the recognized open-source database QMOF and fine-tuned with our augmented self-curated datasets. The GCN models predicted the band gaps of the porphyrin-based materials with the lowest RMSE of 0.2767 eV and MAE of 0.1463 eV. In addition, the data augmentation strategy rotation and mirroring effectively decreased the RMSE by 38.51% and MAE by 50.05%. This study demonstrates that, when proper transfer learning and data augmentation strategies are applied, machine learning models can predict the properties of MOFs using small training data. The Royal Society of Chemistry 2023-06-06 /pmc/articles/PMC10243186/ /pubmed/37288371 http://dx.doi.org/10.1039/d3ra02142d Text en This journal is © The Royal Society of Chemistry https://creativecommons.org/licenses/by-nc/3.0/ |
spellingShingle | Chemistry Zhang, Zhihui Zhang, Chengwei Zhang, Yutao Deng, Shengwei Yang, Yun-Fang Su, An She, Yuan-Bin Predicting band gaps of MOFs on small data by deep transfer learning with data augmentation strategies |
title | Predicting band gaps of MOFs on small data by deep transfer learning with data augmentation strategies |
title_full | Predicting band gaps of MOFs on small data by deep transfer learning with data augmentation strategies |
title_fullStr | Predicting band gaps of MOFs on small data by deep transfer learning with data augmentation strategies |
title_full_unstemmed | Predicting band gaps of MOFs on small data by deep transfer learning with data augmentation strategies |
title_short | Predicting band gaps of MOFs on small data by deep transfer learning with data augmentation strategies |
title_sort | predicting band gaps of mofs on small data by deep transfer learning with data augmentation strategies |
topic | Chemistry |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10243186/ https://www.ncbi.nlm.nih.gov/pubmed/37288371 http://dx.doi.org/10.1039/d3ra02142d |
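
The description above reports model quality as RMSE and MAE (in eV) and reports the effect of data augmentation as a percentage decrease in each. As a reading aid, the following is a minimal sketch of how such figures are conventionally computed. It is not the authors' code: the function names (`rmse`, `mae`, `percent_decrease`) are hypothetical, and the band-gap values are made-up placeholders, not data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error: sqrt of the mean squared difference."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute differences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def percent_decrease(before, after):
    """Relative reduction of an error metric, in percent of the baseline."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    # Placeholder band gaps in eV (illustrative only, not from the paper).
    reference_gaps = [1.80, 2.10, 1.55, 2.40]
    predicted_gaps = [1.70, 2.30, 1.60, 2.00]

    print("RMSE (eV):", rmse(reference_gaps, predicted_gaps))
    print("MAE  (eV):", mae(reference_gaps, predicted_gaps))
    # Hypothetical baseline vs. augmented error, to show the formula only.
    print("Decrease (%):", percent_decrease(0.50, 0.25))
```

Because RMSE squares each error before averaging, it penalizes large individual misses more heavily than MAE does, which is why the two metrics can shrink by different percentages under the same augmentation strategy.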