Accelerating the Selection of Covalent Organic Frameworks with Automated Machine Learning

Covalent organic frameworks (COFs) offer high thermal stability and large specific surface area and show great promise for gas storage and catalysis. This article focuses on the working capacity of COFs for methane (CH₄). Due to the vast num...

Bibliographic Details
Main authors: Yang, Peisong, Zhang, Huan, Lai, Xin, Wang, Kunfeng, Yang, Qingyuan, Yu, Duli
Format: Online Article Text
Language: English
Published: American Chemical Society 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8280634/
https://www.ncbi.nlm.nih.gov/pubmed/34278102
http://dx.doi.org/10.1021/acsomega.0c05990
author Yang, Peisong
Zhang, Huan
Lai, Xin
Wang, Kunfeng
Yang, Qingyuan
Yu, Duli
collection PubMed
description Covalent organic frameworks (COFs) offer high thermal stability and large specific surface area and show great promise for gas storage and catalysis. This article focuses on the working capacity of COFs for methane (CH₄). Because the number of possible COF structures is vast, finding suitable materials with traditional calculation methods is time-consuming, so it is important to apply appropriate machine learning (ML) algorithms to build accurate prediction models. A major obstacle to the use of ML algorithms is that an algorithm's performance may be affected by many design decisions, and finding an appropriate algorithm and model parameters is quite a challenge for nonspecialists. In this work, we use automated machine learning (AutoML) to analyze the CH₄ working capacity of 403,959 COFs. We explore the relationship between the working capacity and 23 features, including the structure, chemical characteristics, and atom types of the COFs. We then compare the tree-based pipeline optimization tool (TPOT), an AutoML tool, with traditional ML methods whose model parameters are set manually: multiple linear regression, support vector machine, decision tree, and random forest. We find that TPOT not only avoids complex data preprocessing and model parameter tuning but also outperforms the traditional ML models. Compared with traditional grand canonical Monte Carlo simulations, it also saves a great deal of time. AutoML removes the need for specialist expertise, so that researchers from other fields can configure parameters automatically and obtain highly accurate, easy-to-understand results, which is of great significance for material screening.
format Online
Article
Text
id pubmed-8280634
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-8280634 2021-07-16 Accelerating the Selection of Covalent Organic Frameworks with Automated Machine Learning Yang, Peisong Zhang, Huan Lai, Xin Wang, Kunfeng Yang, Qingyuan Yu, Duli ACS Omega
American Chemical Society 2021-06-25 /pmc/articles/PMC8280634/ /pubmed/34278102 http://dx.doi.org/10.1021/acsomega.0c05990 Text en © 2021 The Authors. Published by American Chemical Society. Permits non-commercial access and re-use, provided that author attribution and integrity are maintained, but does not permit creation of adaptations or other derivative works (https://creativecommons.org/licenses/by-nc-nd/4.0/).
title Accelerating the Selection of Covalent Organic Frameworks with Automated Machine Learning