Incorporating Domain Knowledge and Structure-Based Descriptors for Machine Learning: A Case Study of Pd-Catalyzed Sonogashira Reactions
Machine learning has revolutionized information processing for large datasets across various fields. However, its limited interpretability poses a significant challenge when applied to chemistry. In this study, we developed a set of simple molecular representations to capture the structural informat...
Main authors: | Chan, Kalok; Ta, Long Thanh; Huang, Yong; Su, Haibin; Lin, Zhenyang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10302643/ https://www.ncbi.nlm.nih.gov/pubmed/37375286 http://dx.doi.org/10.3390/molecules28124730 |
author | Chan, Kalok; Ta, Long Thanh; Huang, Yong; Su, Haibin; Lin, Zhenyang |
---|---|
collection | PubMed |
description | Machine learning has revolutionized information processing for large datasets across various fields. However, its limited interpretability poses a significant challenge when applied to chemistry. In this study, we developed a set of simple molecular representations to capture the structural information of ligands in palladium-catalyzed Sonogashira coupling reactions of aryl bromides. Drawing inspiration from human understanding of catalytic cycles, we used a graph neural network to extract structural details of the phosphine ligand, a major contributor to the overall activation energy. We combined these simple molecular representations with an electronic descriptor of aryl bromide as inputs for a fully connected neural network unit. The results allowed us to predict rate constants and gain mechanistic insights into the rate-limiting oxidative addition process using a relatively small dataset. This study highlights the importance of incorporating domain knowledge in machine learning and presents an alternative approach to data analysis. |
format | Online Article Text |
id | pubmed-10302643 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10302643 2023-06-29 Incorporating Domain Knowledge and Structure-Based Descriptors for Machine Learning: A Case Study of Pd-Catalyzed Sonogashira Reactions. Chan, Kalok; Ta, Long Thanh; Huang, Yong; Su, Haibin; Lin, Zhenyang. Molecules. Article. MDPI 2023-06-13 /pmc/articles/PMC10302643/ /pubmed/37375286 http://dx.doi.org/10.3390/molecules28124730 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | Incorporating Domain Knowledge and Structure-Based Descriptors for Machine Learning: A Case Study of Pd-Catalyzed Sonogashira Reactions |
topic | Article |