Inverse similarity and reliable negative samples for drug side-effect prediction

BACKGROUND: In silico prediction of potential drug side-effects is of crucial importance for drug development, since wet experimental identification of drug side-effects is expensive and time-consuming. Existing computational methods mainly focus on leveraging validated drug side-effect relations fo...

Bibliographic Details
Main Authors:	Zheng, Yi; Peng, Hui; Ghosh, Shameek; Lan, Chaowang; Li, Jinyan
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2019
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7402513/ https://www.ncbi.nlm.nih.gov/pubmed/30717666 http://dx.doi.org/10.1186/s12859-018-2563-x

_version_	1783566772399505408
author	Zheng, Yi Peng, Hui Ghosh, Shameek Lan, Chaowang Li, Jinyan
author_facet	Zheng, Yi Peng, Hui Ghosh, Shameek Lan, Chaowang Li, Jinyan
author_sort	Zheng, Yi
collection	PubMed
description	BACKGROUND: In silico prediction of potential drug side-effects is of crucial importance for drug development, since wet-lab identification of drug side-effects is expensive and time-consuming. Existing computational methods mainly focus on leveraging validated drug side-effect relations for the prediction. Their performance is severely impeded by the lack of reliable negative training data. Thus, a method to select reliable negative samples is vital for improving performance. METHODS: Most existing computational prediction methods are essentially based on the assumption that similar drugs are inclined to share the same side-effects, which has given rise to remarkable performance. It is also rational to assume the inverse proposition that dissimilar drugs are less likely to share the same side-effects. Based on this inverse similarity hypothesis, we proposed a novel method to select highly reliable negative samples for side-effect prediction. The first step of our method is to build a drug similarity integration framework to measure the similarity between drugs from different perspectives. This step integrates drug chemical structures, drug target proteins, drug substituents, and drug therapeutic information as features into a unified framework. Then, a similarity score between each candidate negative drug and the validated positive drugs is calculated using the similarity integration framework. Candidate negative drugs with lower similarity scores are preferentially selected as negative samples. Finally, both the validated positive drugs and the selected highly reliable negative samples are used for predictions. RESULTS: The performance of the proposed method was evaluated on simulated side-effect prediction for 917 DrugBank drugs, with comparisons across four machine-learning algorithms. Extensive experiments show that the drug similarity integration framework has superior capability in capturing drug features, achieving much better performance than frameworks based on a single type of drug property. In addition, the four machine-learning algorithms achieved significant improvements in macro-averaged F1-score (e.g., SVM from 0.655 to 0.898), macro-averaged precision (e.g., RBF from 0.592 to 0.828) and macro-averaged recall (e.g., KNN from 0.651 to 0.772), attributable to the highly reliable negative samples selected by the proposed method. CONCLUSIONS: The results suggest that the inverse similarity hypothesis and the integration of different drug properties are valuable for side-effect prediction. The selection of highly reliable negative samples can also contribute significantly to improved performance. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2563-x) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-7402513
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-7402513 2020-08-07 Inverse similarity and reliable negative samples for drug side-effect prediction Zheng, Yi Peng, Hui Ghosh, Shameek Lan, Chaowang Li, Jinyan BMC Bioinformatics Research BioMed Central 2019-02-04 /pmc/articles/PMC7402513/ /pubmed/30717666 http://dx.doi.org/10.1186/s12859-018-2563-x Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle	Research Zheng, Yi Peng, Hui Ghosh, Shameek Lan, Chaowang Li, Jinyan Inverse similarity and reliable negative samples for drug side-effect prediction
title	Inverse similarity and reliable negative samples for drug side-effect prediction
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7402513/ https://www.ncbi.nlm.nih.gov/pubmed/30717666 http://dx.doi.org/10.1186/s12859-018-2563-x
work_keys_str_mv	AT zhengyi inversesimilarityandreliablenegativesamplesfordrugsideeffectprediction AT penghui inversesimilarityandreliablenegativesamplesfordrugsideeffectprediction AT ghoshshameek inversesimilarityandreliablenegativesamplesfordrugsideeffectprediction AT lanchaowang inversesimilarityandreliablenegativesamplesfordrugsideeffectprediction AT lijinyan inversesimilarityandreliablenegativesamplesfordrugsideeffectprediction
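
The negative-sample selection described in the METHODS part of the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the four similarity views are combined or how a candidate's score against the positive drugs is aggregated, so the weighted mean in `integrate_similarity` and the mean-over-positives score in `select_reliable_negatives` are assumptions made here for illustration, as are both function names and the toy matrices. This is not the authors' implementation.

```python
import numpy as np


def integrate_similarity(sim_matrices, weights=None):
    """Combine per-feature drug-drug similarity matrices into one matrix.

    Assumption: a weighted mean (equal weights by default). The source
    abstract does not specify the integration rule.
    """
    stacked = np.stack(sim_matrices)  # shape: (n_views, n_drugs, n_drugs)
    if weights is None:
        weights = np.full(len(sim_matrices), 1.0 / len(sim_matrices))
    weights = np.asarray(weights, dtype=float)
    # Contract the view axis: sum_k weights[k] * stacked[k]
    return np.tensordot(weights, stacked, axes=1)


def select_reliable_negatives(similarity, positive_idx, candidate_idx, n_select):
    """Pick the candidates least similar to the validated positive drugs.

    Assumption: a candidate's score is its mean integrated similarity to
    all positives; lower scores are preferred (inverse similarity idea).
    Returns the selected drug indices and the score of every candidate.
    """
    sub = similarity[np.ix_(candidate_idx, positive_idx)]
    scores = sub.mean(axis=1)
    order = np.argsort(scores, kind="stable")  # ascending: least similar first
    selected = [candidate_idx[i] for i in order[:n_select]]
    return selected, scores


# Toy data (hypothetical): two similarity views over five drugs.
chem = np.array([
    [1.0, 0.9, 0.8, 0.2, 0.1],
    [0.9, 1.0, 0.7, 0.3, 0.2],
    [0.8, 0.7, 1.0, 0.6, 0.5],
    [0.2, 0.3, 0.6, 1.0, 0.4],
    [0.1, 0.2, 0.5, 0.4, 1.0],
])
target = np.array([
    [1.0, 0.7, 0.6, 0.4, 0.1],
    [0.7, 1.0, 0.5, 0.1, 0.0],
    [0.6, 0.5, 1.0, 0.4, 0.3],
    [0.4, 0.1, 0.4, 1.0, 0.2],
    [0.1, 0.0, 0.3, 0.2, 1.0],
])

integrated = integrate_similarity([chem, target])
positives = [0, 1]      # drugs validated to have the side-effect
candidates = [2, 3, 4]  # drugs with no recorded relation to it
selected, scores = select_reliable_negatives(
    integrated, positives, candidates, n_select=2
)
print(selected)  # [4, 3]
```

In this toy case drug 2 is close to both positives, so it is left out of the negative set, while drugs 4 and 3 are kept. The selected negatives would then be paired with the validated positives to train a classifier for that side-effect.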
