Drug activity prediction using multiple-instance learning via joint instance and feature selection

Bibliographic Details
Main Authors: Zhao, Zhendong, Fu, Gang, Liu, Sheng, Elokely, Khaled M, Doerksen, Robert J, Chen, Yixin, Wilkins, Dawn E
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3850986/
https://www.ncbi.nlm.nih.gov/pubmed/24267824
http://dx.doi.org/10.1186/1471-2105-14-S14-S16
author Zhao, Zhendong
Fu, Gang
Liu, Sheng
Elokely, Khaled M
Doerksen, Robert J
Chen, Yixin
Wilkins, Dawn E
author_sort Zhao, Zhendong
collection PubMed
description BACKGROUND: In drug discovery and development, it is crucial to determine which conformers (instances) of a given molecule are responsible for its observed biological activity and at the same time to recognize the most representative subset of features (molecular descriptors). Due to experimental difficulty in obtaining the bioactive conformers, computational approaches such as machine learning techniques are much needed. Multiple Instance Learning (MIL) is a machine learning method capable of tackling this type of problem. In the MIL framework, each instance is represented as a feature vector, which usually resides in a high-dimensional feature space. The high dimensionality may provide significant information for learning tasks, but at the same time it may also include a large number of irrelevant or redundant features that might negatively affect learning performance. Reducing the dimensionality of data will hence facilitate the classification task and improve the interpretability of the model. RESULTS: In this work we propose a novel approach, named multiple instance learning via joint instance and feature selection. The iterative joint instance and feature selection is achieved using an instance-based feature mapping and 1-norm regularized optimization. The proposed approach was tested on four biological activity datasets. CONCLUSIONS: The empirical results demonstrate that the selected instances (prototype conformers) and features (pharmacophore fingerprints) have competitive discriminative power and the convergence of the selection process is also fast.
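The description above mentions an instance-based feature mapping combined with 1-norm regularized optimization. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes a Gaussian max-similarity embedding of each bag (molecule) against candidate prototype instances (conformers), and it swaps in L1-regularized logistic regression solved by proximal gradient descent (ISTA) for whatever 1-norm solver the paper uses. The function names, the toy data, and the parameters `sigma` and `lam` are all hypothetical, and the feature-selection half of the joint procedure is not shown.

```python
import math

def bag_similarity(bag, prototype, sigma=1.0):
    """Largest Gaussian similarity between any instance in the bag and the prototype."""
    return max(
        math.exp(-sum((a - b) ** 2 for a, b in zip(inst, prototype)) / sigma ** 2)
        for inst in bag
    )

def embed(bags, prototypes, sigma=1.0):
    """Map every bag to one similarity value per candidate prototype."""
    return [[bag_similarity(bag, p, sigma) for p in prototypes] for bag in bags]

def soft_threshold(v, t):
    """Proximal operator of t * |v|; this is what produces exact zeros."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def fit_l1_logistic(X, y, lam=0.05, iters=2000):
    """ISTA on mean logistic loss + lam * ||w||_1 (labels in {0, 1}, bias unpenalized)."""
    n, d = len(X), len(X[0])
    # Embedded values lie in (0, 1], so the gradient's Lipschitz constant
    # is at most (d + 1) / 4; the step below is its reciprocal.
    step = 4.0 / (d + 1)
    w, b = [0.0] * d, 0.0
    for _ in range(iters):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = (1.0 / (1.0 + math.exp(-z)) - yi) / n
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad_w)]
        b -= step * grad_b
    return w, b

# Invented toy data: each bag is a molecule, each instance a 2-D conformer descriptor.
bags = [
    [[0.0, 0.1], [1.0, 1.0]],   # active
    [[0.2, 0.0], [0.9, 1.1]],   # active
    [[0.1, 0.1], [0.0, 0.3]],   # inactive
    [[0.3, 0.0], [0.1, 0.2]],   # inactive
]
labels = [1, 1, 0, 0]

# Every training instance is a candidate prototype.
prototypes = [inst for bag in bags for inst in bag]
X = embed(bags, prototypes)
w, b = fit_l1_logistic(X, labels)

# Prototypes with nonzero weight are the "selected instances" in this sketch.
selected = [j for j, wj in enumerate(w) if abs(wj) > 1e-8]
preds = [1 if b + sum(wj * xj for wj, xj in zip(w, row)) >= 0 else 0 for row in X]
print("selected prototypes:", [prototypes[j] for j in selected])
print("predictions:", preds)
```

The 1-norm penalty is what turns the classifier into a selector: prototypes whose similarity column barely differs between active and inactive bags receive a weight of exactly zero, so only discriminative conformers remain.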
format Online
Article
Text
id pubmed-3850986
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3850986 2013-12-13 Drug activity prediction using multiple-instance learning via joint instance and feature selection Zhao, Zhendong Fu, Gang Liu, Sheng Elokely, Khaled M Doerksen, Robert J Chen, Yixin Wilkins, Dawn E BMC Bioinformatics Proceedings
BioMed Central 2013-10-09 /pmc/articles/PMC3850986/ /pubmed/24267824 http://dx.doi.org/10.1186/1471-2105-14-S14-S16 Text en Copyright © 2013 Zhao et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Drug activity prediction using multiple-instance learning via joint instance and feature selection
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3850986/
https://www.ncbi.nlm.nih.gov/pubmed/24267824
http://dx.doi.org/10.1186/1471-2105-14-S14-S16