
Prediction of drug-disease associations based on ensemble meta paths and singular value decomposition

BACKGROUND: Drug repositioning assumes that similar drugs may treat similar diseases, so many existing computational methods need to compute drug and disease similarities. However, a similarity score depends on the measure adopted and the available f...


Bibliographic Details
Main Authors: Wu, Guangsheng, Liu, Juan, Yue, Xiang
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6439991/
https://www.ncbi.nlm.nih.gov/pubmed/30925858
http://dx.doi.org/10.1186/s12859-019-2644-5
author Wu, Guangsheng
Liu, Juan
Yue, Xiang
author_facet Wu, Guangsheng
Liu, Juan
Yue, Xiang
author_sort Wu, Guangsheng
collection PubMed
description BACKGROUND: Drug repositioning assumes that similar drugs may treat similar diseases, so many existing computational methods need to compute drug and disease similarities. However, a similarity score depends on the measure adopted and the features available, so scores can vary dramatically from one measure to another, and the calculation fails when the data are incomplete. In addition, supervised learning methods usually need both positive and negative samples to train a prediction model, whereas drug-disease pair data contain only some verified interactions (positive samples) and many unlabeled pairs. To train their models, many methods simply treat the unlabeled samples as negative, which may introduce artificial noise. Here, we propose a method that predicts drug-disease associations without similarity information and selects more likely negative samples. RESULTS: In the proposed EMP-SVD (Ensemble Meta Paths and Singular Value Decomposition), we introduce five meta paths corresponding to different kinds of interaction data and generate a commuting matrix for each meta path. Each matrix is factorized by SVD into two low-rank matrices, which serve as the latent features of drugs and diseases respectively. These features are combined to represent drug-disease pairs. We build a Random Forest base classifier for each meta path and combine the five base classifiers into the final ensemble classifier. To train a more reliable prediction model, we select more likely negative samples from the unlabeled ones under the assumption that a non-associated drug-disease pair shares no interacting proteins. Experiments show that EMP-SVD outperforms several state-of-the-art approaches. Case studies based on literature investigation found that EMP-SVD can uncover many drug-disease associations, which suggests that the method is practical. CONCLUSIONS: EMP-SVD can integrate interaction data among drugs, proteins and diseases and predict drug-disease associations without similarity information. In addition, the strategy of selecting more reliable negative samples benefits the prediction.
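The commuting matrix and the negative-sample assumption described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes the usual definition of a commuting matrix as the product of the adjacency matrices along a meta path, and it assumes a drug-protein-disease meta path, since the abstract does not list the five meta paths. The function names and the toy matrices are hypothetical. The SVD and Random Forest steps are not shown.

```python
from typing import List, Tuple

Matrix = List[List[int]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain integer matrix product of a (n x k) and b (k x m)."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def commuting_matrix(adjacencies: List[Matrix]) -> Matrix:
    """Multiply the adjacency matrices along a meta path, in order.

    Entry (i, j) counts the path instances that connect node i of the
    first type to node j of the last type (assumed definition).
    """
    result = adjacencies[0]
    for adj in adjacencies[1:]:
        result = matmul(result, adj)
    return result


def likely_negatives(known: Matrix, shared: Matrix) -> List[Tuple[int, int]]:
    """Unlabeled (drug, disease) pairs that share no interacting protein."""
    return [
        (i, j)
        for i, row in enumerate(known)
        for j, label in enumerate(row)
        if label == 0 and shared[i][j] == 0
    ]


# Hypothetical toy data: 3 drugs, 2 proteins, 3 diseases.
drug_protein = [[1, 0], [1, 1], [0, 0]]
protein_disease = [[1, 0, 0], [0, 1, 0]]
known_assoc = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]  # 1 = verified association

# Drug-protein-disease path counts; zero means no shared protein.
paths = commuting_matrix([drug_protein, protein_disease])
negatives = likely_negatives(known_assoc, paths)
```

In this toy example the pairs (1, 0) and (1, 1) stay unlabeled rather than negative, because drug 1 shares a protein with diseases 0 and 1.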
format Online
Article
Text
id pubmed-6439991
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6439991 2019-04-11 Prediction of drug-disease associations based on ensemble meta paths and singular value decomposition Wu, Guangsheng Liu, Juan Yue, Xiang BMC Bioinformatics Research BioMed Central 2019-03-29 /pmc/articles/PMC6439991/ /pubmed/30925858 http://dx.doi.org/10.1186/s12859-019-2644-5 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Wu, Guangsheng
Liu, Juan
Yue, Xiang
Prediction of drug-disease associations based on ensemble meta paths and singular value decomposition
title Prediction of drug-disease associations based on ensemble meta paths and singular value decomposition
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6439991/
https://www.ncbi.nlm.nih.gov/pubmed/30925858
http://dx.doi.org/10.1186/s12859-019-2644-5