Cargando…

Discovering the Active Ingredients of Medicine and Food Homologous Substances for Inhibiting the Cyclooxygenase-2 Metabolic Pathway by Machine Learning Algorithms

Cyclooxygenase-2 (COX-2) and microsomal prostaglandin E(2) synthase (mPGES-1) are two key targets in anti-inflammatory therapy. Medicine and food homology (MFH) substances have both edible and medicinal properties, providing a valuable resource for the development of novel, safe, and efficient COX-2...

Descripción completa

Detalles Bibliográficos
Autores principales: Tian, Yujia, Zhang, Zhixing, Yan, Aixia
Formato: Online Artículo Texto
Lenguaje:English
Publicado: MDPI 2023
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10574661/
https://www.ncbi.nlm.nih.gov/pubmed/37836625
http://dx.doi.org/10.3390/molecules28196782
_version_ 1785120744795013120
author Tian, Yujia
Zhang, Zhixing
Yan, Aixia
author_facet Tian, Yujia
Zhang, Zhixing
Yan, Aixia
author_sort Tian, Yujia
collection PubMed
description Cyclooxygenase-2 (COX-2) and microsomal prostaglandin E(2) synthase (mPGES-1) are two key targets in anti-inflammatory therapy. Medicine and food homology (MFH) substances have both edible and medicinal properties, providing a valuable resource for the development of novel, safe, and efficient COX-2 and mPGES-1 inhibitors. In this study, we collected active ingredients from 503 MFH substances and constructed the first comprehensive MFH database containing 27,319 molecules. Subsequently, we performed Murcko scaffold analysis and K-means clustering to deeply analyze the composition of the constructed database and evaluate its structural diversity. Furthermore, we employed four supervised machine learning algorithms, including support vector machine (SVM), random forest (RF), deep neural networks (DNNs), and eXtreme Gradient Boosting (XGBoost), as well as ensemble learning, to establish 640 classification models and 160 regression models for COX-2 and mPGES-1 inhibitors. Among them, ModelA_ensemble_RF_1 emerged as the optimal classification model for COX-2 inhibitors, achieving predicted Matthews correlation coefficient (MCC) values of 0.802 and 0.603 on the test set and external validation set, respectively. ModelC_RDKIT_SVM_2 was identified as the best regression model based on COX-2 inhibitors, with root mean squared error (RMSE) values of 0.419 and 0.513 on the test set and external validation set, respectively. ModelD_ECFP_SVM_4 stood out as the top classification model for mPGES-1 inhibitors, attaining MCC values of 0.832 and 0.584 on the test set and external validation set, respectively. The optimal regression model for mPGES-1 inhibitors, ModelF_3D_SVM_1, exhibited predictive RMSE values of 0.253 and 0.35 on the test set and external validation set, respectively. Finally, we proposed a ligand-based cascade virtual screening strategy, which integrated the well-performing supervised machine learning models with unsupervised learning: the self-organized map (SOM) and molecular scaffold analysis. Using this virtual screening workflow, we discovered 10 potential COX-2 inhibitors and 15 potential mPGES-1 inhibitors from the MFH database. We further verified candidates by molecular docking, investigated the interaction of the candidate molecules upon binding to COX-2 or mPGES-1. The constructed comprehensive MFH database has laid a solid foundation for the further research and utilization of the MFH substances. The series of well-performing machine learning models can be employed to predict the COX-2 and mPGES-1 inhibitory capabilities of unknown compounds, thereby aiding in the discovery of anti-inflammatory medications. The COX-2 and mPGES-1 potential inhibitor molecules identified through the cascade virtual screening approach provide insights and references for the design of highly effective and safe novel anti-inflammatory drugs.
format Online
Article
Text
id pubmed-10574661
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-105746612023-10-14 Discovering the Active Ingredients of Medicine and Food Homologous Substances for Inhibiting the Cyclooxygenase-2 Metabolic Pathway by Machine Learning Algorithms Tian, Yujia Zhang, Zhixing Yan, Aixia Molecules Article Cyclooxygenase-2 (COX-2) and microsomal prostaglandin E(2) synthase (mPGES-1) are two key targets in anti-inflammatory therapy. Medicine and food homology (MFH) substances have both edible and medicinal properties, providing a valuable resource for the development of novel, safe, and efficient COX-2 and mPGES-1 inhibitors. In this study, we collected active ingredients from 503 MFH substances and constructed the first comprehensive MFH database containing 27,319 molecules. Subsequently, we performed Murcko scaffold analysis and K-means clustering to deeply analyze the composition of the constructed database and evaluate its structural diversity. Furthermore, we employed four supervised machine learning algorithms, including support vector machine (SVM), random forest (RF), deep neural networks (DNNs), and eXtreme Gradient Boosting (XGBoost), as well as ensemble learning, to establish 640 classification models and 160 regression models for COX-2 and mPGES-1 inhibitors. Among them, ModelA_ensemble_RF_1 emerged as the optimal classification model for COX-2 inhibitors, achieving predicted Matthews correlation coefficient (MCC) values of 0.802 and 0.603 on the test set and external validation set, respectively. ModelC_RDKIT_SVM_2 was identified as the best regression model based on COX-2 inhibitors, with root mean squared error (RMSE) values of 0.419 and 0.513 on the test set and external validation set, respectively. ModelD_ECFP_SVM_4 stood out as the top classification model for mPGES-1 inhibitors, attaining MCC values of 0.832 and 0.584 on the test set and external validation set, respectively. The optimal regression model for mPGES-1 inhibitors, ModelF_3D_SVM_1, exhibited predictive RMSE values of 0.253 and 0.35 on the test set and external validation set, respectively. Finally, we proposed a ligand-based cascade virtual screening strategy, which integrated the well-performing supervised machine learning models with unsupervised learning: the self-organized map (SOM) and molecular scaffold analysis. Using this virtual screening workflow, we discovered 10 potential COX-2 inhibitors and 15 potential mPGES-1 inhibitors from the MFH database. We further verified candidates by molecular docking, investigated the interaction of the candidate molecules upon binding to COX-2 or mPGES-1. The constructed comprehensive MFH database has laid a solid foundation for the further research and utilization of the MFH substances. The series of well-performing machine learning models can be employed to predict the COX-2 and mPGES-1 inhibitory capabilities of unknown compounds, thereby aiding in the discovery of anti-inflammatory medications. The COX-2 and mPGES-1 potential inhibitor molecules identified through the cascade virtual screening approach provide insights and references for the design of highly effective and safe novel anti-inflammatory drugs. MDPI 2023-09-23 /pmc/articles/PMC10574661/ /pubmed/37836625 http://dx.doi.org/10.3390/molecules28196782 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Tian, Yujia
Zhang, Zhixing
Yan, Aixia
Discovering the Active Ingredients of Medicine and Food Homologous Substances for Inhibiting the Cyclooxygenase-2 Metabolic Pathway by Machine Learning Algorithms
title Discovering the Active Ingredients of Medicine and Food Homologous Substances for Inhibiting the Cyclooxygenase-2 Metabolic Pathway by Machine Learning Algorithms
title_full Discovering the Active Ingredients of Medicine and Food Homologous Substances for Inhibiting the Cyclooxygenase-2 Metabolic Pathway by Machine Learning Algorithms
title_fullStr Discovering the Active Ingredients of Medicine and Food Homologous Substances for Inhibiting the Cyclooxygenase-2 Metabolic Pathway by Machine Learning Algorithms
title_full_unstemmed Discovering the Active Ingredients of Medicine and Food Homologous Substances for Inhibiting the Cyclooxygenase-2 Metabolic Pathway by Machine Learning Algorithms
title_short Discovering the Active Ingredients of Medicine and Food Homologous Substances for Inhibiting the Cyclooxygenase-2 Metabolic Pathway by Machine Learning Algorithms
title_sort discovering the active ingredients of medicine and food homologous substances for inhibiting the cyclooxygenase-2 metabolic pathway by machine learning algorithms
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10574661/
https://www.ncbi.nlm.nih.gov/pubmed/37836625
http://dx.doi.org/10.3390/molecules28196782
work_keys_str_mv AT tianyujia discoveringtheactiveingredientsofmedicineandfoodhomologoussubstancesforinhibitingthecyclooxygenase2metabolicpathwaybymachinelearningalgorithms
AT zhangzhixing discoveringtheactiveingredientsofmedicineandfoodhomologoussubstancesforinhibitingthecyclooxygenase2metabolicpathwaybymachinelearningalgorithms
AT yanaixia discoveringtheactiveingredientsofmedicineandfoodhomologoussubstancesforinhibitingthecyclooxygenase2metabolicpathwaybymachinelearningalgorithms