Identifying porous cage subsets in the Cambridge Structural Database using topological data analysis
As rationally designable materials, the variety and number of synthesised metal–organic cages (MOCs) and organic cages (OCs) are expected to grow in the Cambridge Structural Database (CSD). In this regard, two of the most important questions are, which structures are already present in the CSD and h...
Main authors: | Li, Aurelia; Bueno-Perez, Rocio; Fairen-Jimenez, David |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | The Royal Society of Chemistry, 2022 |
Subjects: | Chemistry |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9682994/ https://www.ncbi.nlm.nih.gov/pubmed/36507160 http://dx.doi.org/10.1039/d2sc03171j |
_version_ | 1784834978240004096 |
---|---|
author | Li, Aurelia Bueno-Perez, Rocio Fairen-Jimenez, David |
author_facet | Li, Aurelia Bueno-Perez, Rocio Fairen-Jimenez, David |
author_sort | Li, Aurelia |
collection | PubMed |
description | As rationally designable materials, the variety and number of synthesised metal–organic cages (MOCs) and organic cages (OCs) are expected to grow in the Cambridge Structural Database (CSD). In this regard, two of the most important questions are, which structures are already present in the CSD and how can they be identified? Here, we present a cage mining methodology based on topological data analysis and a combination of supervised and unsupervised learning that led to the derivation of – to the best of our knowledge – the first and only MOC dataset of 1839 structures and the largest experimental OC dataset of 7736 cages, as of March 2022. We illustrate the use of such datasets with a high-throughput screening of MOCs and OCs for xenon/krypton separation, important gases in multiple industries, including healthcare. |
format | Online Article Text |
id | pubmed-9682994 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | The Royal Society of Chemistry |
record_format | MEDLINE/PubMed |
spelling | pubmed-96829942022-12-08 Identifying porous cage subsets in the Cambridge Structural Database using topological data analysis Li, Aurelia Bueno-Perez, Rocio Fairen-Jimenez, David Chem Sci Chemistry As rationally designable materials, the variety and number of synthesised metal–organic cages (MOCs) and organic cages (OCs) are expected to grow in the Cambridge Structural Database (CSD). In this regard, two of the most important questions are, which structures are already present in the CSD and how can they be identified? Here, we present a cage mining methodology based on topological data analysis and a combination of supervised and unsupervised learning that led to the derivation of – to the best of our knowledge – the first and only MOC dataset of 1839 structures and the largest experimental OC dataset of 7736 cages, as of March 2022. We illustrate the use of such datasets with a high-throughput screening of MOCs and OCs for xenon/krypton separation, important gases in multiple industries, including healthcare. The Royal Society of Chemistry 2022-10-31 /pmc/articles/PMC9682994/ /pubmed/36507160 http://dx.doi.org/10.1039/d2sc03171j Text en This journal is © The Royal Society of Chemistry https://creativecommons.org/licenses/by/3.0/ |
spellingShingle | Chemistry Li, Aurelia Bueno-Perez, Rocio Fairen-Jimenez, David Identifying porous cage subsets in the Cambridge Structural Database using topological data analysis |
title | Identifying porous cage subsets in the Cambridge Structural Database using topological data analysis |
title_full | Identifying porous cage subsets in the Cambridge Structural Database using topological data analysis |
title_fullStr | Identifying porous cage subsets in the Cambridge Structural Database using topological data analysis |
title_full_unstemmed | Identifying porous cage subsets in the Cambridge Structural Database using topological data analysis |
title_short | Identifying porous cage subsets in the Cambridge Structural Database using topological data analysis |
title_sort | identifying porous cage subsets in the cambridge structural database using topological data analysis |
topic | Chemistry |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9682994/ https://www.ncbi.nlm.nih.gov/pubmed/36507160 http://dx.doi.org/10.1039/d2sc03171j |
work_keys_str_mv | AT liaurelia identifyingporouscagesubsetsinthecambridgestructuraldatabaseusingtopologicaldataanalysis AT buenoperezrocio identifyingporouscagesubsetsinthecambridgestructuraldatabaseusingtopologicaldataanalysis AT fairenjimenezdavid identifyingporouscagesubsetsinthecambridgestructuraldatabaseusingtopologicaldataanalysis |