One class classification as a practical approach for accelerating π–π co-crystal discovery

Bibliographic Details
Main Authors: Vriza, Aikaterini, Canaj, Angelos B., Vismara, Rebecca, Kershaw Cook, Laurence J., Manning, Troy D., Gaultois, Michael W., Wood, Peter A., Kurlin, Vitaliy, Berry, Neil, Dyer, Matthew S., Rosseinsky, Matthew J.
Format: Online Article Text
Language: English
Published: The Royal Society of Chemistry 2020
Subjects: Chemistry
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8179233/
https://www.ncbi.nlm.nih.gov/pubmed/34163930
http://dx.doi.org/10.1039/d0sc04263c
Collection: PubMed
Description: The implementation of machine learning models has brought major changes in the decision-making process for materials design. One matter of concern for the data-driven approaches is the lack of negative data from unsuccessful synthetic attempts, which might generate inherently imbalanced datasets. We propose the application of the one-class classification methodology as an effective tool for tackling these limitations on the materials design problems. This is a concept of learning based only on a well-defined class without counter examples. An extensive study on the different one-class classification algorithms is performed until the most appropriate workflow is identified for guiding the discovery of emerging materials belonging to a relatively small class, that being the weakly bound polyaromatic hydrocarbon co-crystals. The two-step approach presented in this study first trains the model using all the known molecular combinations that form this class of co-crystals extracted from the Cambridge Structural Database (1722 molecular combinations), followed by scoring possible yet unknown pairs from the ZINC15 database (21 736 possible molecular combinations). Focusing on the highest-ranking pairs predicted to have higher probability of forming co-crystals, materials discovery can be accelerated by reducing the vast molecular space and directing the synthetic efforts of chemists. Further on, using interpretability techniques a more detailed understanding of the molecular properties causing co-crystallization is sought after. The applicability of the current methodology is demonstrated with the discovery of two novel co-crystals, namely pyrene-6H-benzo[c]chromen-6-one (1) and pyrene-9,10-dicyanoanthracene (2).
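The two-step idea in the description (fit on known positives only, then score and rank unlabeled candidates) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scorer is a simple standardized distance to the centroid of the positive class, swapped in as a stand-in and not one of the algorithms evaluated in the article; the two-feature descriptors, the synthetic data, and the names `fit_one_class`, `score`, `rank_candidates`, and `pair_A` to `pair_C` are all hypothetical.

```python
import math
import random


def fit_one_class(positives):
    """Learn per-feature mean and spread from a positive-only training set."""
    n = len(positives)
    dims = len(positives[0])
    means = [sum(row[j] for row in positives) / n for j in range(dims)]
    stds = []
    for j in range(dims):
        var = sum((row[j] - means[j]) ** 2 for row in positives) / n
        stds.append(math.sqrt(var) or 1.0)  # fall back to 1.0 if a feature has zero spread
    return means, stds


def score(model, x):
    """Return the negative standardized distance to the class centroid.

    Higher (closer to zero) means more similar to the known class.
    """
    means, stds = model
    return -math.sqrt(sum(((xj - m) / s) ** 2 for xj, m, s in zip(x, means, stds)))


def rank_candidates(model, candidates):
    """Sort (name, score) pairs from most to least class-like."""
    scored = [(name, score(model, feats)) for name, feats in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Step 1: train on known positives only (synthetic stand-ins for pair descriptors).
rng = random.Random(0)
known_pairs = [[rng.gauss(1.0, 0.1), rng.gauss(5.0, 0.5)] for _ in range(200)]
model = fit_one_class(known_pairs)

# Step 2: score unlabeled candidate pairs and rank them (hypothetical candidates).
candidates = {
    "pair_A": [1.0, 5.0],   # near the centre of the known class
    "pair_B": [1.3, 6.5],   # a few standard deviations away
    "pair_C": [4.0, -2.0],  # far outside the known class
}
ranking = rank_candidates(model, candidates)
for name, value in ranking:
    print(f"{name}: {value:.2f}")
```

In a setting like the one described, the top of such a ranking would be the shortlist handed to chemists for synthesis attempts; a real workflow would replace the toy descriptors and the centroid scorer with molecular features and a properly validated one-class model.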
Record ID: pubmed-8179233
Institution: National Center for Biotechnology Information
Record Format: MEDLINE/PubMed
Journal: Chem Sci
Publication Date: 2020-12-08
License: This journal is © The Royal Society of Chemistry. https://creativecommons.org/licenses/by/3.0/