Random-forest model for drug–target interaction prediction via Kullbeck–Leibler divergence

Virtual screening has significantly improved the success rate of early stage drug discovery. Recent virtual screening methods have improved owing to advances in machine learning and chemical information. Among these advances, the creative extraction of drug features is important for predicting drug–...

Bibliographic Details
Main Authors: Ahn, Sangjin, Lee, Si Eun, Kim, Mi-hyun
Format: Online Article Text
Language: English
Published: Springer International Publishing 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9531514/
https://www.ncbi.nlm.nih.gov/pubmed/36192818
http://dx.doi.org/10.1186/s13321-022-00644-1
_version_ 1784801918310154240
author Ahn, Sangjin
Lee, Si Eun
Kim, Mi-hyun
author_facet Ahn, Sangjin
Lee, Si Eun
Kim, Mi-hyun
author_sort Ahn, Sangjin
collection PubMed
description Virtual screening has significantly improved the success rate of early stage drug discovery. Recent virtual screening methods have improved owing to advances in machine learning and chemical information. Among these advances, the creative extraction of drug features is important for predicting drug–target interaction (DTI), which is a large-scale virtual screening of known drugs. Herein, we report Kullbeck–Leibler divergence (KLD) as a DTI feature and the feature-driven classification model applicable to DTI prediction. For the purpose, E3FP three-dimensional (3D) molecular fingerprints of drugs as a molecular representation allow the computation of 3D similarities between ligands within each target (Q–Q matrix) to identify the uniqueness of pharmacological targets and those between a query and a ligand (Q–L vector) in DTIs. The 3D similarity matrices are transformed into probability density functions via kernel density estimation as a nonparametric estimation. Each density model can exploit the characteristics of each pharmacological target and measure the quasi-distance between the ligands. Furthermore, we developed a random forest model from the KLD feature vectors to successfully predict DTIs for representative 17 targets (mean accuracy: 0.882, out-of-bag score estimate: 0.876, ROC AUC: 0.990). The method is applicable for 2D chemical similarity. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13321-022-00644-1.
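The pipeline in the description (similarity values, kernel density estimation, then a divergence between the two densities) can be illustrated with a minimal sketch. This is not the authors' implementation. The similarity values, the bandwidth, the evaluation grid, the direction of the divergence, and the function names `gaussian_kde` and `kl_divergence` are all assumptions made for illustration, and the random forest step is left out.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a density function built from Gaussian kernels centred on `samples`."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

def kl_divergence(p, q, grid, eps=1e-12):
    """Approximate KL(P || Q) by a Riemann sum of p*log(p/q) on a uniform grid."""
    step = grid[1] - grid[0]
    total = 0.0
    for x in grid:
        px = p(x) + eps  # eps avoids log(0) and division by zero in the tails
        qx = q(x) + eps
        total += px * math.log(px / qx) * step
    return total

# Hypothetical similarity values (not data from the paper).
qq_similarities = [0.62, 0.58, 0.71, 0.66, 0.55, 0.69]  # ligand-ligand within one target
ql_close = [0.60, 0.64, 0.57, 0.70, 0.63]               # query resembling the target's ligands
ql_far = [0.12, 0.18, 0.09, 0.15, 0.21]                 # query unlike the target's ligands

bandwidth = 0.05                              # assumed kernel bandwidth
grid = [i / 200 for i in range(-100, 301)]    # -0.5 to 1.5, covers the kernel tails

target_density = gaussian_kde(qq_similarities, bandwidth)
kld_close = kl_divergence(gaussian_kde(ql_close, bandwidth), target_density, grid)
kld_far = kl_divergence(gaussian_kde(ql_far, bandwidth), target_density, grid)

print(f"KLD, similar query:    {kld_close:.3f}")
print(f"KLD, dissimilar query: {kld_far:.3f}")
```

In this sketch a query whose similarity distribution resembles the target's own ligand distribution gives a small divergence, and a dissimilar query gives a large one. One such value per target would form a feature vector of the kind the description says is passed to the random forest model.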
format Online
Article
Text
id pubmed-9531514
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-95315142022-10-05 Random-forest model for drug–target interaction prediction via Kullbeck–Leibler divergence Ahn, Sangjin Lee, Si Eun Kim, Mi-hyun J Cheminform Research Virtual screening has significantly improved the success rate of early stage drug discovery. Recent virtual screening methods have improved owing to advances in machine learning and chemical information. Among these advances, the creative extraction of drug features is important for predicting drug–target interaction (DTI), which is a large-scale virtual screening of known drugs. Herein, we report Kullbeck–Leibler divergence (KLD) as a DTI feature and the feature-driven classification model applicable to DTI prediction. For the purpose, E3FP three-dimensional (3D) molecular fingerprints of drugs as a molecular representation allow the computation of 3D similarities between ligands within each target (Q–Q matrix) to identify the uniqueness of pharmacological targets and those between a query and a ligand (Q–L vector) in DTIs. The 3D similarity matrices are transformed into probability density functions via kernel density estimation as a nonparametric estimation. Each density model can exploit the characteristics of each pharmacological target and measure the quasi-distance between the ligands. Furthermore, we developed a random forest model from the KLD feature vectors to successfully predict DTIs for representative 17 targets (mean accuracy: 0.882, out-of-bag score estimate: 0.876, ROC AUC: 0.990). The method is applicable for 2D chemical similarity. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13321-022-00644-1. 
Springer International Publishing 2022-10-03 /pmc/articles/PMC9531514/ /pubmed/36192818 http://dx.doi.org/10.1186/s13321-022-00644-1 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Ahn, Sangjin
Lee, Si Eun
Kim, Mi-hyun
Random-forest model for drug–target interaction prediction via Kullbeck–Leibler divergence
title Random-forest model for drug–target interaction prediction via Kullbeck–Leibler divergence
title_full Random-forest model for drug–target interaction prediction via Kullbeck–Leibler divergence
title_fullStr Random-forest model for drug–target interaction prediction via Kullbeck–Leibler divergence
title_full_unstemmed Random-forest model for drug–target interaction prediction via Kullbeck–Leibler divergence
title_short Random-forest model for drug–target interaction prediction via Kullbeck–Leibler divergence
title_sort random-forest model for drug–target interaction prediction via kullbeck–leibler divergence
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9531514/
https://www.ncbi.nlm.nih.gov/pubmed/36192818
http://dx.doi.org/10.1186/s13321-022-00644-1
work_keys_str_mv AT ahnsangjin randomforestmodelfordrugtargetinteractionpredictionviakullbeckleiblerdivergence
AT leesieun randomforestmodelfordrugtargetinteractionpredictionviakullbeckleiblerdivergence
AT kimmihyun randomforestmodelfordrugtargetinteractionpredictionviakullbeckleiblerdivergence