Predicting drug side effects by multi-label learning and ensemble learning

BACKGROUND: Predicting drug side effects is an important topic in drug discovery. Although several machine learning methods have been proposed to predict side effects, there is still room for improvement. First, side effect prediction is a multi-label learning task, and we can adopt the...

Bibliographic Details
Main authors: Zhang, Wen; Liu, Feng; Luo, Longqiang; Zhang, Jingxia
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4634905/
https://www.ncbi.nlm.nih.gov/pubmed/26537615
http://dx.doi.org/10.1186/s12859-015-0774-y
author Zhang, Wen
Liu, Feng
Luo, Longqiang
Zhang, Jingxia
collection PubMed
description BACKGROUND: Predicting drug side effects is an important topic in drug discovery. Although several machine learning methods have been proposed to predict side effects, there is still room for improvement. First, side effect prediction is a multi-label learning task, so multi-label learning techniques can be applied to it. Second, drug-related features are associated with side effects, and feature dimensions have specific biological meanings. Identifying critical dimensions and removing irrelevant ones may help to reveal the causes of side effects. METHODS: In this paper, we propose a novel method, the ‘feature selection-based multi-label k-nearest neighbor method’ (FS-MLKNN), which simultaneously determines critical feature dimensions and constructs high-accuracy multi-label prediction models. RESULTS: Computational experiments demonstrate that FS-MLKNN achieves good performance as well as explainable results. To improve performance further, we develop an ensemble learning model that integrates individual feature-based FS-MLKNN models. Compared with other state-of-the-art methods, the ensemble method performs better on benchmark datasets. CONCLUSIONS: FS-MLKNN and the ensemble method are promising tools for side effect prediction. The source code and datasets are available in Additional file 1. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0774-y) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4634905
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4634905 2015-11-06 Predicting drug side effects by multi-label learning and ensemble learning Zhang, Wen Liu, Feng Luo, Longqiang Zhang, Jingxia BMC Bioinformatics Research Article BioMed Central 2015-11-04 /pmc/articles/PMC4634905/ /pubmed/26537615 http://dx.doi.org/10.1186/s12859-015-0774-y Text en © Zhang et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Predicting drug side effects by multi-label learning and ensemble learning
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4634905/
https://www.ncbi.nlm.nih.gov/pubmed/26537615
http://dx.doi.org/10.1186/s12859-015-0774-y