Prediction of drug-disease associations based on ensemble meta paths and singular value decomposition
BACKGROUND: In the field of drug repositioning, it is assumed that similar drugs may treat similar diseases, so many existing computational methods need to compute the similarities of drugs and diseases. However, the calculation of similarity depends on the adopted measure and the available f...
Main authors: | Wu, Guangsheng; Liu, Juan; Yue, Xiang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6439991/ https://www.ncbi.nlm.nih.gov/pubmed/30925858 http://dx.doi.org/10.1186/s12859-019-2644-5 |
_version_ | 1783407306133733376 |
---|---|
author | Wu, Guangsheng Liu, Juan Yue, Xiang |
author_facet | Wu, Guangsheng Liu, Juan Yue, Xiang |
author_sort | Wu, Guangsheng |
collection | PubMed |
description | BACKGROUND: In the field of drug repositioning, it is assumed that similar drugs may treat similar diseases, so many existing computational methods need to compute the similarities of drugs and diseases. However, the calculation of similarity depends on the adopted measure and the available features, which may cause the similarity scores to vary dramatically from one to another, and it does not work with incomplete data. Besides, methods based on supervised learning usually need both positive and negative samples to train the prediction models, whereas drug-disease pair data contain only some verified interactions (positive samples) and many unlabeled pairs. To train the models, many methods simply treat the unlabeled samples as negative ones, which may introduce artificial noise. Herein, we propose a method that predicts drug-disease associations without the need for similarity information and selects more likely negative samples. RESULTS: In the proposed EMP-SVD (Ensemble Meta Paths and Singular Value Decomposition), we introduce five meta paths corresponding to different kinds of interaction data, and for each meta path we generate a commuting matrix. Every matrix is factorized by SVD into two low-rank matrices, which are used as the latent features of drugs and diseases, respectively. The features are combined to represent drug-disease pairs. We build a base classifier via Random Forest for each meta path, and the five base classifiers are combined into the final ensemble classifier. To train a more reliable prediction model, we select more likely negative samples from the unlabeled ones under the assumption that a non-associated drug and disease share no interacting proteins. The experiments show that the proposed EMP-SVD method outperforms several state-of-the-art approaches. Case studies based on literature investigation found that EMP-SVD can uncover many drug-disease associations, which suggests that EMP-SVD is practical. CONCLUSIONS: The proposed EMP-SVD can integrate the interaction data among drugs, proteins and diseases, and predict drug-disease associations without the need for similarity information. At the same time, the strategy of selecting more reliable negative samples benefits the prediction. |
format | Online Article Text |
id | pubmed-6439991 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6439991 2019-04-11 Prediction of drug-disease associations based on ensemble meta paths and singular value decomposition Wu, Guangsheng Liu, Juan Yue, Xiang BMC Bioinformatics Research BioMed Central 2019-03-29 /pmc/articles/PMC6439991/ /pubmed/30925858 http://dx.doi.org/10.1186/s12859-019-2644-5 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Wu, Guangsheng Liu, Juan Yue, Xiang Prediction of drug-disease associations based on ensemble meta paths and singular value decomposition |
title | Prediction of drug-disease associations based on ensemble meta paths and singular value decomposition |
title_sort | prediction of drug-disease associations based on ensemble meta paths and singular value decomposition |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6439991/ https://www.ncbi.nlm.nih.gov/pubmed/30925858 http://dx.doi.org/10.1186/s12859-019-2644-5 |
work_keys_str_mv | AT wuguangsheng predictionofdrugdiseaseassociationsbasedonensemblemetapathsandsingularvaluedecomposition AT liujuan predictionofdrugdiseaseassociationsbasedonensemblemetapathsandsingularvaluedecomposition AT yuexiang predictionofdrugdiseaseassociationsbasedonensemblemetapathsandsingularvaluedecomposition |
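
The pipeline summarized in the abstract (commuting matrix per meta path, SVD into latent features, selection of likely negatives) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and not taken from the record: the single drug-protein-disease meta path, the toy matrices, the function names, the square-root split of the singular values, and concatenation as the way features are "combined". The Random Forest base classifiers and the ensemble step are omitted.

```python
import numpy as np


def commuting_matrix(*adjacency):
    """Multiply adjacency matrices along one meta path.

    For a hypothetical drug-protein-disease path, entry (i, j) counts the
    proteins that link drug i to disease j.
    """
    result = adjacency[0]
    for matrix in adjacency[1:]:
        result = result @ matrix
    return result


def latent_features(commuting, rank):
    """Factorize a commuting matrix with a truncated SVD.

    Assumption: the singular values are split evenly (square root) between
    the two factors, so drug_features @ disease_features.T approximates
    the commuting matrix.
    """
    u, s, vt = np.linalg.svd(commuting, full_matrices=False)
    root = np.sqrt(s[:rank])
    drug_features = u[:, :rank] * root
    disease_features = vt[:rank, :].T * root
    return drug_features, disease_features


def pair_features(drug_features, disease_features, drug, disease):
    """Represent one drug-disease pair (assumption: simple concatenation)."""
    return np.concatenate([drug_features[drug], disease_features[disease]])


def likely_negatives(drug_protein, protein_disease, drug_disease):
    """Unlabeled pairs whose drug and disease share no interacting protein."""
    shared = drug_protein @ protein_disease
    mask = (shared == 0) & (drug_disease == 0)
    return [(int(i), int(j)) for i, j in np.argwhere(mask)]


# Toy data (hypothetical): 3 drugs, 4 proteins, 2 diseases.
drug_protein = np.array([[1, 1, 0, 0],
                         [0, 1, 1, 0],
                         [0, 0, 0, 1]])
protein_disease = np.array([[1, 0],
                            [1, 0],
                            [0, 1],
                            [0, 0]])
drug_disease = np.array([[1, 0],
                         [0, 1],
                         [0, 0]])  # known associations (positive samples)

commuting = commuting_matrix(drug_protein, protein_disease)
drug_features, disease_features = latent_features(commuting, rank=2)
pair = pair_features(drug_features, disease_features, drug=1, disease=1)
negatives = likely_negatives(drug_protein, protein_disease, drug_disease)
```

In a fuller version, each meta path would yield its own feature set, one classifier would be trained per path on the positives and the selected negatives, and the per-path predictions would then be combined; how the record's authors combine them is not stated in the abstract.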