
XGBoost: An Optimal Machine Learning Model with Just Structural Features to Discover MOF Adsorbents of Xe/Kr

The inert gases Xe and Kr exist mainly in used nuclear fuel (UNF) at a Xe/Kr ratio of 20:80 and are difficult to separate. In this work, based on the G-MOFs database, high-throughput computational screening for metal–organic frameworks (MOFs) with high Xe/Kr adsorption...


Bibliographic Details
Main authors: Liang, Heng, Jiang, Kun, Yan, Tong-An, Chen, Guang-Hui
Format: Online Article Text
Language: English
Published: American Chemical Society 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8028164/
https://www.ncbi.nlm.nih.gov/pubmed/33842776
http://dx.doi.org/10.1021/acsomega.1c00100
collection PubMed
description The inert gases Xe and Kr exist mainly in used nuclear fuel (UNF) at a Xe/Kr ratio of 20:80 and are difficult to separate. In this work, based on the G-MOFs database, high-throughput computational screening for metal–organic frameworks (MOFs) with high Xe/Kr adsorption selectivity was performed by combining grand canonical Monte Carlo (GCMC) simulations and machine learning (ML) techniques for the first time. A comparison of eight classical ML models shows that the XGBoost model with seven structural descriptors is the most accurate in predicting the Xe/Kr adsorption and separation performance of MOFs. Compared with energetic or electronic descriptors, structural descriptors are easier to obtain. The coefficients of determination R² of the generalized model for Xe adsorption and Xe/Kr selectivity are 0.951 and 0.973, respectively. In addition, of the top 1000 MOFs ranked by GCMC simulation for adsorption capacity and for selectivity, the XGBoost model correctly predicted 888 and 896, respectively. According to the feature engineering of the XGBoost model, the density (ρ), porosity (ϕ), pore volume (Vol), and pore limiting diameter (PLD) of MOFs are the key features that affect the Xe/Kr adsorption property. To test the generalization ability of the XGBoost model, MOF adsorbents were also screened for the CO₂/CH₄ mixture; there too, XGBoost predicts much better than the traditional machine learning models, even though the data are unbalanced. Because the feature dimension of MOFs is low while the number of MOF samples in the database is very large, the problem suits a model such as XGBoost, which searches for the global minimum of the cost function, rather than a model that involves feature creation. The present study is the first report of using the XGBoost algorithm to discover MOF adsorbents.
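The two figures of merit quoted in the description, the coefficient of determination R² and the number of top-1000 MOFs recovered by the model, can be illustrated with a short sketch. This is not the authors' code: the function names `r_squared` and `top_k_overlap` and the toy numbers are hypothetical, and it assumes the "888 of 1000" style count is the overlap between the top-k set ranked by reference (GCMC) values and the top-k set ranked by model predictions, which the record does not state explicitly.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def top_k_overlap(y_true, y_pred, k):
    """Count items that rank in the top k by both reference and prediction."""
    def top_k(values):
        order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
        return set(order[:k])
    return len(top_k(y_true) & top_k(y_pred))


# Toy data (made up for illustration, not taken from the study).
reference = [5.0, 3.0, 4.0, 1.0, 2.0]   # e.g. simulated selectivities
predicted = [4.8, 3.5, 3.1, 1.2, 2.1]   # e.g. model predictions

print(r_squared(reference, predicted))
print(top_k_overlap(reference, predicted, k=2))
```

With k = 1000 and the full database in place of the toy lists, `top_k_overlap` would give a count comparable in kind to the 888 and 896 reported, under the overlap assumption stated above.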
id pubmed-8028164
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-8028164 2021-04-09. Journal: ACS Omega. Published by American Chemical Society, 2021-03-19. Text en. © 2021 The Authors. Published by American Chemical Society. Permits non-commercial access and re-use, provided that author attribution and integrity are maintained; but does not permit creation of adaptations or other derivative works (https://creativecommons.org/licenses/by-nc-nd/4.0/).