Identifying porous cage subsets in the Cambridge Structural Database using topological data analysis

As rationally designable materials, the variety and number of synthesised metal–organic cages (MOCs) and organic cages (OCs) are expected to grow in the Cambridge Structural Database (CSD). In this regard, two of the most important questions are, which structures are already present in the CSD and h...

Bibliographic Details
Main Authors: Li, Aurelia; Bueno-Perez, Rocio; Fairen-Jimenez, David
Format: Online Article Text
Language: English
Published: The Royal Society of Chemistry 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9682994/
https://www.ncbi.nlm.nih.gov/pubmed/36507160
http://dx.doi.org/10.1039/d2sc03171j
_version_ 1784834978240004096
author Li, Aurelia
Bueno-Perez, Rocio
Fairen-Jimenez, David
collection PubMed
description As rationally designable materials, the variety and number of synthesised metal–organic cages (MOCs) and organic cages (OCs) are expected to grow in the Cambridge Structural Database (CSD). In this regard, two of the most important questions are, which structures are already present in the CSD and how can they be identified? Here, we present a cage mining methodology based on topological data analysis and a combination of supervised and unsupervised learning that led to the derivation of – to the best of our knowledge – the first and only MOC dataset of 1839 structures and the largest experimental OC dataset of 7736 cages, as of March 2022. We illustrate the use of such datasets with a high-throughput screening of MOCs and OCs for xenon/krypton separation, important gases in multiple industries, including healthcare.
format Online
Article
Text
id pubmed-9682994
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher The Royal Society of Chemistry
record_format MEDLINE/PubMed
spelling pubmed-96829942022-12-08 Identifying porous cage subsets in the Cambridge Structural Database using topological data analysis Li, Aurelia Bueno-Perez, Rocio Fairen-Jimenez, David Chem Sci Chemistry As rationally designable materials, the variety and number of synthesised metal–organic cages (MOCs) and organic cages (OCs) are expected to grow in the Cambridge Structural Database (CSD). In this regard, two of the most important questions are, which structures are already present in the CSD and how can they be identified? Here, we present a cage mining methodology based on topological data analysis and a combination of supervised and unsupervised learning that led to the derivation of – to the best of our knowledge – the first and only MOC dataset of 1839 structures and the largest experimental OC dataset of 7736 cages, as of March 2022. We illustrate the use of such datasets with a high-throughput screening of MOCs and OCs for xenon/krypton separation, important gases in multiple industries, including healthcare. The Royal Society of Chemistry 2022-10-31 /pmc/articles/PMC9682994/ /pubmed/36507160 http://dx.doi.org/10.1039/d2sc03171j Text en This journal is © The Royal Society of Chemistry https://creativecommons.org/licenses/by/3.0/
title Identifying porous cage subsets in the Cambridge Structural Database using topological data analysis
topic Chemistry
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9682994/
https://www.ncbi.nlm.nih.gov/pubmed/36507160
http://dx.doi.org/10.1039/d2sc03171j