Drug activity prediction using multiple-instance learning via joint instance and feature selection
BACKGROUND: In drug discovery and development, it is crucial to determine which conformers (instances) of a given molecule are responsible for its observed biological activity and at the same time to recognize the most representative subset of features (molecular descriptors). Due to experimental di...
Main authors: | Zhao, Zhendong; Fu, Gang; Liu, Sheng; Elokely, Khaled M; Doerksen, Robert J; Chen, Yixin; Wilkins, Dawn E |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2013 |
Subjects: | Proceedings |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3850986/ https://www.ncbi.nlm.nih.gov/pubmed/24267824 http://dx.doi.org/10.1186/1471-2105-14-S14-S16 |
_version_ | 1782294204925345792 |
---|---|
author | Zhao, Zhendong Fu, Gang Liu, Sheng Elokely, Khaled M Doerksen, Robert J Chen, Yixin Wilkins, Dawn E |
author_facet | Zhao, Zhendong Fu, Gang Liu, Sheng Elokely, Khaled M Doerksen, Robert J Chen, Yixin Wilkins, Dawn E |
author_sort | Zhao, Zhendong |
collection | PubMed |
description | BACKGROUND: In drug discovery and development, it is crucial to determine which conformers (instances) of a given molecule are responsible for its observed biological activity and at the same time to recognize the most representative subset of features (molecular descriptors). Due to experimental difficulty in obtaining the bioactive conformers, computational approaches such as machine learning techniques are much needed. Multiple Instance Learning (MIL) is a machine learning method capable of tackling this type of problem. In the MIL framework, each instance is represented as a feature vector, which usually resides in a high-dimensional feature space. The high dimensionality may provide significant information for learning tasks, but at the same time it may also include a large number of irrelevant or redundant features that might negatively affect learning performance. Reducing the dimensionality of data will hence facilitate the classification task and improve the interpretability of the model. RESULTS: In this work we propose a novel approach, named multiple instance learning via joint instance and feature selection. The iterative joint instance and feature selection is achieved using an instance-based feature mapping and 1-norm regularized optimization. The proposed approach was tested on four biological activity datasets. CONCLUSIONS: The empirical results demonstrate that the selected instances (prototype conformers) and features (pharmacophore fingerprints) have competitive discriminative power and the convergence of the selection process is also fast. |
format | Online Article Text |
id | pubmed-3850986 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3850986 2013-12-13 Drug activity prediction using multiple-instance learning via joint instance and feature selection Zhao, Zhendong Fu, Gang Liu, Sheng Elokely, Khaled M Doerksen, Robert J Chen, Yixin Wilkins, Dawn E BMC Bioinformatics Proceedings BioMed Central 2013-10-09 /pmc/articles/PMC3850986/ /pubmed/24267824 http://dx.doi.org/10.1186/1471-2105-14-S14-S16 Text en Copyright © 2013 Zhao et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Proceedings Zhao, Zhendong Fu, Gang Liu, Sheng Elokely, Khaled M Doerksen, Robert J Chen, Yixin Wilkins, Dawn E Drug activity prediction using multiple-instance learning via joint instance and feature selection |
title | Drug activity prediction using multiple-instance learning via joint instance and feature selection |
title_full | Drug activity prediction using multiple-instance learning via joint instance and feature selection |
title_fullStr | Drug activity prediction using multiple-instance learning via joint instance and feature selection |
title_full_unstemmed | Drug activity prediction using multiple-instance learning via joint instance and feature selection |
title_short | Drug activity prediction using multiple-instance learning via joint instance and feature selection |
title_sort | drug activity prediction using multiple-instance learning via joint instance and feature selection |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3850986/ https://www.ncbi.nlm.nih.gov/pubmed/24267824 http://dx.doi.org/10.1186/1471-2105-14-S14-S16 |
work_keys_str_mv | AT zhaozhendong drugactivitypredictionusingmultipleinstancelearningviajointinstanceandfeatureselection AT fugang drugactivitypredictionusingmultipleinstancelearningviajointinstanceandfeatureselection AT liusheng drugactivitypredictionusingmultipleinstancelearningviajointinstanceandfeatureselection AT elokelykhaledm drugactivitypredictionusingmultipleinstancelearningviajointinstanceandfeatureselection AT doerksenrobertj drugactivitypredictionusingmultipleinstancelearningviajointinstanceandfeatureselection AT chenyixin drugactivitypredictionusingmultipleinstancelearningviajointinstanceandfeatureselection AT wilkinsdawne drugactivitypredictionusingmultipleinstancelearningviajointinstanceandfeatureselection |
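
The description above names two ingredients, an instance-based feature mapping and 1-norm regularized optimization, without giving formulas. The following is a minimal sketch of how those two pieces can fit together, not the authors' implementation. It assumes a Gaussian maximum-similarity mapping, a logistic loss solved by proximal gradient descent, and hypothetical 2-D toy data; all function names and parameter values (`sigma`, `lam`, `step`, `iters`) are illustrative choices. It covers only a single pass of prototype (instance) selection and does not reproduce the iterative alternation with descriptor (feature) selection mentioned in the record.

```python
import math

def bag_features(bag, prototypes, sigma=1.0):
    """One feature per prototype: the highest Gaussian similarity between
    that prototype and any instance (conformer) in the bag."""
    return [
        max(math.exp(-sum((a - b) ** 2 for a, b in zip(x, p)) / sigma ** 2)
            for x in bag)
        for p in prototypes
    ]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(v, t):
    """Proximal operator of t * |v|; it sets small weights exactly to zero."""
    return math.copysign(max(abs(v) - t, 0.0), v)

def fit_l1_logistic(X, y, lam=0.05, step=0.2, iters=2000):
    """Proximal gradient descent on mean logistic loss + lam * ||w||_1.
    Labels are +1 / -1; the bias b is not penalized."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(iters):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            coeff = -yi * sigmoid(-yi * score) / n
            grad_b += coeff
            for j in range(d):
                grad_w[j] += coeff * xi[j]
        # Gradient step on the smooth loss, then shrink weights toward zero.
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
        b -= step * grad_b
    return w, b

# Hypothetical toy data: each bag is a molecule, each 2-D point a conformer.
bags = [
    [[0.0, 0.0], [1.0, 1.0]],    # active
    [[0.1, -0.1], [0.9, 1.1]],   # active
    [[0.0, 0.1], [-0.1, 0.0]],   # inactive
    [[0.1, 0.1], [0.0, -0.1]],   # inactive
]
labels = [1, 1, -1, -1]

# Every training instance is a candidate prototype (an assumption of this sketch).
prototypes = [x for bag in bags for x in bag]
features = [bag_features(bag, prototypes) for bag in bags]
w, b = fit_l1_logistic(features, labels)

# Prototypes whose weight survived the 1-norm penalty.
selected = [j for j, wj in enumerate(w) if abs(wj) > 1e-9]
predictions = [1 if sum(wj * fj for wj, fj in zip(w, f)) + b > 0 else -1
               for f in features]
print("selected prototype indices:", selected)
print("predictions:", predictions)
```

In this sketch the 1-norm penalty is what performs the selection: a prototype whose weight is driven to zero drops out of the classifier, so the surviving indices can be read as candidate prototype conformers. Applying the same idea to the descriptor dimensions, and alternating between the two, would be one way to approach the joint selection the record describes.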