MOST: most-similar ligand based approach to target prediction

BACKGROUND: Many computational approaches have been used for target prediction, including machine learning, reverse docking, bioactivity spectra analysis, and chemical similarity searching. Recent studies have suggested that chemical similarity searching may be driven by the most-similar ligand. How...

Bibliographic Details
Main authors: Huang, Tao; Mi, Hong; Lin, Cheng-yuan; Zhao, Ling; Zhong, Linda L. D.; Liu, Feng-bin; Zhang, Ge; Lu, Ai-ping; Bian, Zhao-xiang
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects: Methodology Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5346209/
https://www.ncbi.nlm.nih.gov/pubmed/28284192
http://dx.doi.org/10.1186/s12859-017-1586-z
author Huang, Tao
Mi, Hong
Lin, Cheng-yuan
Zhao, Ling
Zhong, Linda L. D.
Liu, Feng-bin
Zhang, Ge
Lu, Ai-ping
Bian, Zhao-xiang
collection PubMed
description BACKGROUND: Many computational approaches have been used for target prediction, including machine learning, reverse docking, bioactivity spectra analysis, and chemical similarity searching. Recent studies have suggested that chemical similarity searching may be driven by the most-similar ligand. However, the extent of bioactivity of most-similar ligands has been oversimplified or even neglected in these studies, and this has impaired the prediction power. RESULTS: Here we propose the MOst-Similar ligand-based Target inference approach, namely MOST, which uses fingerprint similarity and explicit bioactivity of the most-similar ligands to predict targets of the query compound. Performance of MOST was evaluated by using combinations of different fingerprint schemes, machine learning methods, and bioactivity representations. In sevenfold cross-validation with a benchmark Ki dataset from CHEMBL release 19 containing 61,937 bioactivity data of 173 human targets, MOST achieved high average prediction accuracy (0.95 for pKi ≥ 5, and 0.87 for pKi ≥ 6). Morgan fingerprint was shown to be slightly better than FP2. Logistic Regression and Random Forest methods performed better than Naïve Bayes. In a temporal validation, the Ki dataset from CHEMBL19 was used to train models and predict the bioactivity of newly deposited ligands in CHEMBL20. MOST also performed well, with high accuracy (0.90 for pKi ≥ 5, and 0.76 for pKi ≥ 6), when Logistic Regression and Morgan fingerprint were employed. Furthermore, the p values associated with explicit bioactivity were found to be a robust index for removing false positive predictions. Implicit bioactivity did not offer this capability.
Finally, p values generated with Logistic Regression, Morgan fingerprint and explicit activity were integrated with a false discovery rate (FDR) control procedure to reduce false positives in a multiple-target prediction scenario, and the success of this strategy was demonstrated with a case of fluanisone. In the case of aloe-emodin’s laxative effect, MOST predicted that acetylcholinesterase was the mechanism-of-action target; in vivo studies validated this prediction. CONCLUSIONS: Using the MOST approach can result in highly accurate and robust target prediction. Integrated with an FDR control procedure, MOST provides a reliable framework for multiple-target inference. It has prospective applications in drug repurposing and mechanism-of-action target prediction. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1586-z) contains supplementary material, which is available to authorized users.
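The two ideas named in the abstract, picking the most-similar known ligand for each target and filtering multi-target predictions with an FDR procedure, can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes Tanimoto similarity over fingerprints represented as sets of on-bits and the Benjamini-Hochberg procedure for FDR control; the abstract names neither. The function names, target labels, fingerprints and p values are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def most_similar_ligand(
    query: FrozenSet[int],
    ligands: List[Tuple[FrozenSet[int], float]],
) -> Tuple[float, float]:
    """Return (similarity, pKi) of the known ligand closest to the query.

    Each ligand is a (fingerprint, pKi) pair. Ties in similarity are
    broken in favour of the higher pKi because tuples compare in order.
    """
    return max((tanimoto(query, fp), pki) for fp, pki in ligands)


def benjamini_hochberg(p_values: Dict[str, float], alpha: float = 0.05) -> List[str]:
    """Return the targets kept by the Benjamini-Hochberg step-up procedure.

    Assumption: the abstract only says "FDR control procedure"; BH is used
    here as a common choice, not as the method confirmed by the source.
    """
    ranked = sorted(p_values.items(), key=lambda item: item[1])
    m = len(ranked)
    cutoff = 0
    for rank, (_, p) in enumerate(ranked, start=1):
        # Keep the largest rank whose p value is under its BH threshold.
        if p <= alpha * rank / m:
            cutoff = rank
    return [name for name, _ in ranked[:cutoff]]


# Hypothetical data: one query compound, two known ligands of one target.
query_fp = frozenset({1, 2, 3})
known = [(frozenset({2, 3, 4}), 6.2), (frozenset({1, 2, 3, 5}), 7.1)]
best_similarity, best_pki = most_similar_ligand(query_fp, known)

# Hypothetical per-target p values for the same query compound.
kept_targets = benjamini_hochberg({"T1": 0.001, "T2": 0.02, "T3": 0.04, "T4": 0.5})
```

In the approach described above, the similarity and explicit bioactivity of the most-similar ligand feed a classifier such as Logistic Regression, which produces the per-target p values; the sketch skips that modelling step and starts from p values directly.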
format Online
Article
Text
id pubmed-5346209
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5346209 2017-03-14 BMC Bioinformatics Methodology Article BioMed Central 2017-03-11 /pmc/articles/PMC5346209/ /pubmed/28284192 http://dx.doi.org/10.1186/s12859-017-1586-z Text en © The Author(s). 2017 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title MOST: most-similar ligand based approach to target prediction
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5346209/
https://www.ncbi.nlm.nih.gov/pubmed/28284192
http://dx.doi.org/10.1186/s12859-017-1586-z