One class classification as a practical approach for accelerating π–π co-crystal discovery
The implementation of machine learning models has brought major changes in the decision-making process for materials design. One matter of concern for the data-driven approaches is the lack of negative data from unsuccessful synthetic attempts, which might generate inherently imbalanced datasets. We...
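Main Authors: | Vriza, Aikaterini; Canaj, Angelos B.; Vismara, Rebecca; Kershaw Cook, Laurence J.; Manning, Troy D.; Gaultois, Michael W.; Wood, Peter A.; Kurlin, Vitaliy; Berry, Neil; Dyer, Matthew S.; Rosseinsky, Matthew J. |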
---|---|
Format: | Online Article Text |
Language: | English |
Published: | The Royal Society of Chemistry, 2020 |
Subjects: | Chemistry |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8179233/ https://www.ncbi.nlm.nih.gov/pubmed/34163930 http://dx.doi.org/10.1039/d0sc04263c |
_version_ | 1783703734440689664 |
---|---|
author | Vriza, Aikaterini; Canaj, Angelos B.; Vismara, Rebecca; Kershaw Cook, Laurence J.; Manning, Troy D.; Gaultois, Michael W.; Wood, Peter A.; Kurlin, Vitaliy; Berry, Neil; Dyer, Matthew S.; Rosseinsky, Matthew J. |
author_sort | Vriza, Aikaterini |
collection | PubMed |
description | The implementation of machine learning models has brought major changes in the decision-making process for materials design. One matter of concern for the data-driven approaches is the lack of negative data from unsuccessful synthetic attempts, which might generate inherently imbalanced datasets. We propose the application of the one-class classification methodology as an effective tool for tackling these limitations on the materials design problems. This is a concept of learning based only on a well-defined class without counter examples. An extensive study on the different one-class classification algorithms is performed until the most appropriate workflow is identified for guiding the discovery of emerging materials belonging to a relatively small class, that being the weakly bound polyaromatic hydrocarbon co-crystals. The two-step approach presented in this study first trains the model using all the known molecular combinations that form this class of co-crystals extracted from the Cambridge Structural Database (1722 molecular combinations), followed by scoring possible yet unknown pairs from the ZINC15 database (21 736 possible molecular combinations). Focusing on the highest-ranking pairs predicted to have higher probability of forming co-crystals, materials discovery can be accelerated by reducing the vast molecular space and directing the synthetic efforts of chemists. Further on, using interpretability techniques a more detailed understanding of the molecular properties causing co-crystallization is sought after. The applicability of the current methodology is demonstrated with the discovery of two novel co-crystals, namely pyrene-6H-benzo[c]chromen-6-one (1) and pyrene-9,10-dicyanoanthracene (2). |
format | Online Article Text |
id | pubmed-8179233 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | The Royal Society of Chemistry |
record_format | MEDLINE/PubMed |
spelling | pubmed-8179233; 2021-06-22; Chem Sci; Chemistry; The Royal Society of Chemistry; 2020-12-08; /pmc/articles/PMC8179233/; /pubmed/34163930; http://dx.doi.org/10.1039/d0sc04263c; Text; en; This journal is © The Royal Society of Chemistry; https://creativecommons.org/licenses/by/3.0/ |
title | One class classification as a practical approach for accelerating π–π co-crystal discovery |
topic | Chemistry |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8179233/ https://www.ncbi.nlm.nih.gov/pubmed/34163930 http://dx.doi.org/10.1039/d0sc04263c |
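
The two-step workflow summarized in the description (fit on known positive pairs only, then score and rank unlabeled candidate pairs) can be illustrated with a small sketch. This is not the authors' model: it swaps in a plain k-nearest-neighbour distance score as a stand-in one-class scorer, and the two-dimensional descriptors, the pair names, and the value of `k` are all made-up placeholders.

```python
import math

def knn_score(candidate, positives, k=3):
    """Negative mean distance to the k nearest known positives.

    Higher (closer to zero) means the candidate looks more like the
    positive class. No negative examples are used anywhere.
    """
    dists = sorted(math.dist(candidate, p) for p in positives)
    return -sum(dists[:k]) / k

def rank_candidates(candidates, positives, k=3):
    """Score every unlabeled pair and sort from most to least promising."""
    scored = [(name, knn_score(vec, positives, k))
              for name, vec in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Step 1: the "training set" is the positive class only.
# Hypothetical descriptors standing in for known co-crystal pairs.
known_pairs = [(0.9, 1.0), (1.0, 1.1), (1.1, 0.9), (1.0, 1.0)]

# Step 2: unlabeled candidates to screen (hypothetical names and values).
unlabeled = {
    "pair_A": (1.0, 1.05),
    "pair_B": (3.0, 3.0),
    "pair_C": (1.4, 1.2),
}

ranking = rank_candidates(unlabeled, known_pairs)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In a screening setting only the top of such a ranking would be passed on for synthesis attempts; any cut-off is a judgment call and is not specified by this record.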