CoGT: Ensemble Machine Learning Method and Its Application on JAK Inhibitor Discovery
The discovery of new drug candidates to inhibit an intended target is a complex and resource-consuming process. A machine learning (ML) method for predicting drug–target interactions (DTI) is a potential solution to improve the efficiency. However, traditional ML approaches have li...
Main authors: | Bu, Yingzi, Gao, Ruoxi, Zhang, Bohan, Zhang, Luchen, Sun, Duxin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Chemical Society 2023 |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10099439/ https://www.ncbi.nlm.nih.gov/pubmed/37065046 http://dx.doi.org/10.1021/acsomega.3c00160 |
_version_ | 1785025053472063488 |
---|---|
author | Bu, Yingzi Gao, Ruoxi Zhang, Bohan Zhang, Luchen Sun, Duxin |
author_facet | Bu, Yingzi Gao, Ruoxi Zhang, Bohan Zhang, Luchen Sun, Duxin |
author_sort | Bu, Yingzi |
collection | PubMed |
description | The discovery of new drug candidates to inhibit an intended target is a complex and resource-consuming process. A machine learning (ML) method for predicting drug–target interactions (DTI) is a potential solution to improve the efficiency. However, traditional ML approaches have limitations in accuracy. In this study, we developed a novel ensemble model CoGT for DTI prediction using multilayer perceptron (MLP), which integrated graph-based models to extract non-Euclidean molecular structures and large pretrained models, specifically chemBERTa, to process simplified molecular input line entry systems (SMILES). The performance of CoGT was evaluated using compounds inhibiting four Janus kinases (JAKs). Results showed that the large pretrained model, chemBERTa, was better than other conventional ML models in predicting DTI across multiple evaluation metrics, while the graph neural network (GNN) was effective for prediction on imbalanced data sets. To take full advantage of the strengths of these different models, we developed an ensemble model, CoGT, which outperformed other individual ML models in predicting compounds’ inhibition on different isoforms of JAKs. Our data suggest that the ensemble model CoGT has the potential to accelerate the process of drug discovery. |
format | Online Article Text |
id | pubmed-10099439 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | American Chemical Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-10099439 2023-04-14 CoGT: Ensemble Machine Learning Method and Its Application on JAK Inhibitor Discovery Bu, Yingzi Gao, Ruoxi Zhang, Bohan Zhang, Luchen Sun, Duxin ACS Omega The discovery of new drug candidates to inhibit an intended target is a complex and resource-consuming process. A machine learning (ML) method for predicting drug–target interactions (DTI) is a potential solution to improve the efficiency. However, traditional ML approaches have limitations in accuracy. In this study, we developed a novel ensemble model CoGT for DTI prediction using multilayer perceptron (MLP), which integrated graph-based models to extract non-Euclidean molecular structures and large pretrained models, specifically chemBERTa, to process simplified molecular input line entry systems (SMILES). The performance of CoGT was evaluated using compounds inhibiting four Janus kinases (JAKs). Results showed that the large pretrained model, chemBERTa, was better than other conventional ML models in predicting DTI across multiple evaluation metrics, while the graph neural network (GNN) was effective for prediction on imbalanced data sets. To take full advantage of the strengths of these different models, we developed an ensemble model, CoGT, which outperformed other individual ML models in predicting compounds’ inhibition on different isoforms of JAKs. Our data suggest that the ensemble model CoGT has the potential to accelerate the process of drug discovery. American Chemical Society 2023-03-27 /pmc/articles/PMC10099439/ /pubmed/37065046 http://dx.doi.org/10.1021/acsomega.3c00160 Text en © 2023 The Authors. Published by American Chemical Society https://creativecommons.org/licenses/by-nc-nd/4.0/ Permits non-commercial access and re-use, provided that author attribution and integrity are maintained; but does not permit creation of adaptations or other derivative works (https://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Bu, Yingzi Gao, Ruoxi Zhang, Bohan Zhang, Luchen Sun, Duxin CoGT: Ensemble Machine Learning Method and Its Application on JAK Inhibitor Discovery |
title | CoGT: Ensemble Machine Learning Method and Its Application on JAK Inhibitor Discovery |
title_full | CoGT: Ensemble Machine Learning Method and Its Application on JAK Inhibitor Discovery |
title_fullStr | CoGT: Ensemble Machine Learning Method and Its Application on JAK Inhibitor Discovery |
title_full_unstemmed | CoGT: Ensemble Machine Learning Method and Its Application on JAK Inhibitor Discovery |
title_short | CoGT: Ensemble Machine Learning Method and Its Application on JAK Inhibitor Discovery |
title_sort | cogt: ensemble machine learning method and its application on jak inhibitor discovery |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10099439/ https://www.ncbi.nlm.nih.gov/pubmed/37065046 http://dx.doi.org/10.1021/acsomega.3c00160 |
work_keys_str_mv | AT buyingzi cogtensemblemachinelearningmethodanditsapplicationonjakinhibitordiscovery AT gaoruoxi cogtensemblemachinelearningmethodanditsapplicationonjakinhibitordiscovery AT zhangbohan cogtensemblemachinelearningmethodanditsapplicationonjakinhibitordiscovery AT zhangluchen cogtensemblemachinelearningmethodanditsapplicationonjakinhibitordiscovery AT sunduxin cogtensemblemachinelearningmethodanditsapplicationonjakinhibitordiscovery |