CoGT: Ensemble Machine Learning Method and Its Application on JAK Inhibitor Discovery

Bibliographic Details
Main authors: Bu, Yingzi; Gao, Ruoxi; Zhang, Bohan; Zhang, Luchen; Sun, Duxin
Format: Online Article Text
Language: English
Published: American Chemical Society 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10099439/
https://www.ncbi.nlm.nih.gov/pubmed/37065046
http://dx.doi.org/10.1021/acsomega.3c00160
Description
Summary: The discovery of new drug candidates to inhibit an intended target is a complex and resource-consuming process. A machine learning (ML) method for predicting drug–target interactions (DTI) is a potential way to improve efficiency. However, traditional ML approaches have limitations in accuracy. In this study, we developed a novel ensemble model, CoGT, for DTI prediction. CoGT uses a multilayer perceptron (MLP) to integrate graph-based models, which extract non-Euclidean molecular structures, with large pretrained models, specifically chemBERTa, which process simplified molecular input line entry system (SMILES) strings. The performance of CoGT was evaluated using compounds inhibiting four Janus kinases (JAKs). Results showed that the large pretrained model, chemBERTa, outperformed conventional ML models in predicting DTI across multiple evaluation metrics, while the graph neural network (GNN) was effective for prediction on imbalanced data sets. To take full advantage of the strengths of these different models, we developed an ensemble model, CoGT, which outperformed the individual ML models in predicting compounds’ inhibition of different JAK isoforms. Our data suggest that the ensemble model CoGT has the potential to accelerate the process of drug discovery.
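
The ensemble idea in the summary can be illustrated with a minimal sketch. This record does not say what the MLP actually receives from the base models, so the code below assumes the simplest case: each base model emits one score per compound, and a small MLP learns to fuse the two scores into a single inhibition probability. The class `TinyMLP`, the variables `gnn_scores` and `lm_scores`, and all numbers are hypothetical stand-ins, not the authors' implementation or data.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyMLP:
    """One-hidden-layer perceptron that fuses base-model scores (illustrative only)."""

    def __init__(self, n_inputs, n_hidden=4, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _forward(self, x):
        # tanh hidden layer followed by a sigmoid output unit
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        out = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, out

    def predict_proba(self, x):
        return self._forward(x)[1]

    def fit(self, X, y, epochs=200, lr=0.1):
        # Plain stochastic gradient descent on binary cross-entropy
        for _ in range(epochs):
            for x, target in zip(X, y):
                hidden, out = self._forward(x)
                delta_out = out - target  # dLoss/d(pre-sigmoid activation)
                for j, h in enumerate(hidden):
                    # Backpropagate through tanh using the pre-update output weight
                    delta_h = delta_out * self.w2[j] * (1.0 - h * h)
                    self.w2[j] -= lr * delta_out * h
                    for i, xi in enumerate(x):
                        self.w1[j][i] -= lr * delta_h * xi
                    self.b1[j] -= lr * delta_h
                self.b2 -= lr * delta_out
        return self


def clip01(value):
    return min(1.0, max(0.0, value))


# Synthetic labels (1 = inhibitor) and hypothetical base-model scores.
# These stand in for the outputs of a GNN and a SMILES language model.
rng = random.Random(42)
labels = [rng.randint(0, 1) for _ in range(40)]
gnn_scores = [clip01(0.30 + 0.40 * y + rng.gauss(0, 0.15)) for y in labels]
lm_scores = [clip01(0.25 + 0.50 * y + rng.gauss(0, 0.15)) for y in labels]
features = [[g, s] for g, s in zip(gnn_scores, lm_scores)]

ensemble = TinyMLP(n_inputs=2).fit(features, labels)
probabilities = [ensemble.predict_proba(x) for x in features]
```

In practice the fused inputs could just as well be learned embeddings rather than scalar scores, and the meta-learner should be fitted on held-out predictions so it does not simply memorize the base models' training-set behavior.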