Random-forest model for drug–target interaction prediction via Kullbeck–Leibler divergence
Virtual screening has significantly improved the success rate of early stage drug discovery. Recent virtual screening methods have improved owing to advances in machine learning and chemical information. Among these advances, the creative extraction of drug features is important for predicting drug–...
Main authors: | Ahn, Sangjin; Lee, Si Eun; Kim, Mi-hyun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer International Publishing 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9531514/ https://www.ncbi.nlm.nih.gov/pubmed/36192818 http://dx.doi.org/10.1186/s13321-022-00644-1 |
_version_ | 1784801918310154240 |
---|---|
author | Ahn, Sangjin Lee, Si Eun Kim, Mi-hyun |
author_sort | Ahn, Sangjin |
collection | PubMed |
description | Virtual screening has significantly improved the success rate of early-stage drug discovery. Recent virtual screening methods have improved owing to advances in machine learning and chemical information. Among these advances, the creative extraction of drug features is important for predicting drug–target interaction (DTI), which is a large-scale virtual screening of known drugs. Herein, we report Kullback–Leibler divergence (KLD) as a DTI feature and a feature-driven classification model applicable to DTI prediction. For this purpose, E3FP three-dimensional (3D) molecular fingerprints of drugs serve as the molecular representation and allow the computation of 3D similarities between ligands within each target (Q–Q matrix), to identify the uniqueness of pharmacological targets, and between a query and a ligand (Q–L vector) in DTIs. The 3D similarity matrices are transformed into probability density functions via kernel density estimation, a nonparametric estimation method. Each density model can exploit the characteristics of each pharmacological target and measure the quasi-distance between the ligands. Furthermore, we developed a random forest model from the KLD feature vectors to successfully predict DTIs for 17 representative targets (mean accuracy: 0.882, out-of-bag score estimate: 0.876, ROC AUC: 0.990). The method is applicable to 2D chemical similarity. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13321-022-00644-1. |
format | Online Article Text |
id | pubmed-9531514 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Springer International Publishing |
record_format | MEDLINE/PubMed |
spelling | pubmed-9531514 2022-10-05 Random-forest model for drug–target interaction prediction via Kullbeck–Leibler divergence Ahn, Sangjin Lee, Si Eun Kim, Mi-hyun J Cheminform Research
Springer International Publishing 2022-10-03 /pmc/articles/PMC9531514/ /pubmed/36192818 http://dx.doi.org/10.1186/s13321-022-00644-1 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Random-forest model for drug–target interaction prediction via Kullbeck–Leibler divergence |
topic | Research |
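
The feature construction summarized in the description (similarity values, then kernel density estimation, then Kullback–Leibler divergence) can be sketched as follows. This is a minimal illustration and not the authors' implementation. The function names, the fixed bandwidth of 0.05, the evaluation grid, and the toy similarity values are all assumptions made for this example; none of them come from the record or the article. The random forest step is left out so that the sketch needs only the Python standard library.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function estimated from `samples` with a Gaussian kernel.

    `bandwidth` is a hypothetical fixed value chosen for this sketch.
    """
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def kl_divergence(p, q, grid, eps=1e-12):
    """Approximate D_KL(P || Q) by discretising both densities on `grid`.

    Both densities are renormalised to sum to 1 over the grid, and `eps`
    keeps the logarithm finite where a density is numerically zero.
    """
    p_vals = [p(x) + eps for x in grid]
    q_vals = [q(x) + eps for x in grid]
    p_total, q_total = sum(p_vals), sum(q_vals)
    return sum(
        (pv / p_total) * math.log((pv / p_total) / (qv / q_total))
        for pv, qv in zip(p_vals, q_vals)
    )


# Toy similarity values in [0, 1]; illustrative only, not data from the article.
# "Q-Q": similarities between known ligands of one target.
# "Q-L": similarities between a query compound and that target's ligands.
qq_by_target = {
    "target_A": [0.62, 0.70, 0.66, 0.74, 0.68, 0.71],
    "target_B": [0.20, 0.25, 0.31, 0.22, 0.28, 0.26],
}
ql_by_target = {
    "target_A": [0.64, 0.69, 0.72, 0.67],
    "target_B": [0.05, 0.08, 0.10, 0.07],
}

BANDWIDTH = 0.05                          # assumed, not from the source
grid = [i / 200 for i in range(201)]      # similarity axis from 0 to 1

# One KLD value per target: divergence of the query's Q-L density
# from the target's own Q-Q density.
features = {}
for target, qq in qq_by_target.items():
    qq_density = gaussian_kde(qq, BANDWIDTH)
    ql_density = gaussian_kde(ql_by_target[target], BANDWIDTH)
    features[target] = kl_divergence(ql_density, qq_density, grid)

for target, value in features.items():
    print(f"{target}: KLD = {value:.4f}")
```

In this toy setup the query resembles the ligands of target_A, so its divergence for target_A comes out smaller than for target_B. A vector of such per-target values could then be passed to a random forest classifier (for example scikit-learn's `RandomForestClassifier` with `oob_score=True`), though how the article assembles and labels its feature vectors is not stated in this record.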