Cargando…

Incorporating Domain Knowledge and Structure-Based Descriptors for Machine Learning: A Case Study of Pd-Catalyzed Sonogashira Reactions

Machine learning has revolutionized information processing for large datasets across various fields. However, its limited interpretability poses a significant challenge when applied to chemistry. In this study, we developed a set of simple molecular representations to capture the structural informat...

Descripción completa

Detalles Bibliográficos
Autores principales: Chan, Kalok, Ta, Long Thanh, Huang, Yong, Su, Haibin, Lin, Zhenyang
Formato: Online Artículo Texto
Lenguaje:English
Publicado: MDPI 2023
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10302643/
https://www.ncbi.nlm.nih.gov/pubmed/37375286
http://dx.doi.org/10.3390/molecules28124730
Descripción
Sumario:Machine learning has revolutionized information processing for large datasets across various fields. However, its limited interpretability poses a significant challenge when applied to chemistry. In this study, we developed a set of simple molecular representations to capture the structural information of ligands in palladium-catalyzed Sonogashira coupling reactions of aryl bromides. Drawing inspiration from human understanding of catalytic cycles, we used a graph neural network to extract structural details of the phosphine ligand, a major contributor to the overall activation energy. We combined these simple molecular representations with an electronic descriptor of aryl bromide as inputs for a fully connected neural network unit. The results allowed us to predict rate constants and gain mechanistic insights into the rate-limiting oxidative addition process using a relatively small dataset. This study highlights the importance of incorporating domain knowledge in machine learning and presents an alternative approach to data analysis.