CoGT: Ensemble Machine Learning Method and Its Application on JAK Inhibitor Discovery

The discovery of new drug candidates to inhibit an intended target is a complex and resource-consuming process. A machine learning (ML) method for predicting drug–target interactions (DTI) is a potential solution to improve the efficiency. However, traditional ML approaches have li...

Bibliographic Details
Main Authors: Bu, Yingzi, Gao, Ruoxi, Zhang, Bohan, Zhang, Luchen, Sun, Duxin
Format: Online Article Text
Language: English
Published: American Chemical Society 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10099439/
https://www.ncbi.nlm.nih.gov/pubmed/37065046
http://dx.doi.org/10.1021/acsomega.3c00160
_version_ 1785025053472063488
author Bu, Yingzi
Gao, Ruoxi
Zhang, Bohan
Zhang, Luchen
Sun, Duxin
collection PubMed
description The discovery of new drug candidates to inhibit an intended target is a complex and resource-consuming process. A machine learning (ML) method for predicting drug–target interactions (DTI) is a potential solution to improve the efficiency. However, traditional ML approaches have limitations in accuracy. In this study, we developed a novel ensemble model CoGT for DTI prediction using multilayer perceptron (MLP), which integrated graph-based models to extract non-Euclidean molecular structures and large pretrained models, specifically chemBERTa, to process simplified molecular input line entry systems (SMILES). The performance of CoGT was evaluated using compounds inhibiting four Janus kinases (JAKs). Results showed that the large pretrained model, chemBERTa, was better than other conventional ML models in predicting DTI across multiple evaluation metrics, while the graph neural network (GNN) was effective for prediction on imbalanced data sets. To take full advantage of the strengths of these different models, we developed an ensemble model, CoGT, which outperformed other individual ML models in predicting compounds’ inhibition on different isoforms of JAKs. Our data suggest that the ensemble model CoGT has the potential to accelerate the process of drug discovery.
format Online
Article
Text
id pubmed-10099439
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-10099439 2023-04-14 CoGT: Ensemble Machine Learning Method and Its Application on JAK Inhibitor Discovery Bu, Yingzi Gao, Ruoxi Zhang, Bohan Zhang, Luchen Sun, Duxin ACS Omega The discovery of new drug candidates to inhibit an intended target is a complex and resource-consuming process. A machine learning (ML) method for predicting drug–target interactions (DTI) is a potential solution to improve the efficiency. However, traditional ML approaches have limitations in accuracy. In this study, we developed a novel ensemble model CoGT for DTI prediction using multilayer perceptron (MLP), which integrated graph-based models to extract non-Euclidean molecular structures and large pretrained models, specifically chemBERTa, to process simplified molecular input line entry systems (SMILES). The performance of CoGT was evaluated using compounds inhibiting four Janus kinases (JAKs). Results showed that the large pretrained model, chemBERTa, was better than other conventional ML models in predicting DTI across multiple evaluation metrics, while the graph neural network (GNN) was effective for prediction on imbalanced data sets. To take full advantage of the strengths of these different models, we developed an ensemble model, CoGT, which outperformed other individual ML models in predicting compounds’ inhibition on different isoforms of JAKs. Our data suggest that the ensemble model CoGT has the potential to accelerate the process of drug discovery. American Chemical Society 2023-03-27 /pmc/articles/PMC10099439/ /pubmed/37065046 http://dx.doi.org/10.1021/acsomega.3c00160 Text en © 2023 The Authors. Published by American Chemical Society https://creativecommons.org/licenses/by-nc-nd/4.0/ Permits non-commercial access and re-use, provided that author attribution and integrity are maintained; but does not permit creation of adaptations or other derivative works (https://creativecommons.org/licenses/by-nc-nd/4.0/).
title CoGT: Ensemble Machine Learning Method and Its Application on JAK Inhibitor Discovery