Fast rule-based bioactivity prediction using associative classification mining
Relating chemical features to bioactivities is critical in molecular design and is used extensively in the lead discovery and optimization process. A variety of techniques from statistics, data mining and machine learning have been applied to this process. In this study, we utilize a collection of m...
Main Authors: | Yu, Pulan; Wild, David J |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2012 |
Subjects: | Methodology |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3515428/ https://www.ncbi.nlm.nih.gov/pubmed/23176548 http://dx.doi.org/10.1186/1758-2946-4-29 |
_version_ | 1782252178219466752 |
---|---|
author | Yu, Pulan Wild, David J |
author_facet | Yu, Pulan Wild, David J |
author_sort | Yu, Pulan |
collection | PubMed |
description | Relating chemical features to bioactivities is critical in molecular design and is used extensively in the lead discovery and optimization process. A variety of techniques from statistics, data mining and machine learning have been applied to this process. In this study, we utilize a collection of methods, called associative classification mining (ACM), which are popular in the data mining community, but so far have not been applied widely in cheminformatics. More specifically, classification based on predictive association rules (CPAR), classification based on multiple association rules (CMAR) and classification based on association rules (CBA) are employed on three datasets using various descriptor sets. Experimental evaluations on anti-tuberculosis (antiTB), mutagenicity and hERG (the human Ether-a-go-go-Related Gene) blocker datasets show that these three methods are computationally scalable and appropriate for high speed mining. Additionally, they provide comparable accuracy and efficiency to the commonly used Bayesian and support vector machines (SVM) methods, and produce highly interpretable models. |
format | Online Article Text |
id | pubmed-3515428 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3515428 2012-12-06 Fast rule-based bioactivity prediction using associative classification mining Yu, Pulan Wild, David J J Cheminform Methodology Relating chemical features to bioactivities is critical in molecular design and is used extensively in the lead discovery and optimization process. A variety of techniques from statistics, data mining and machine learning have been applied to this process. In this study, we utilize a collection of methods, called associative classification mining (ACM), which are popular in the data mining community, but so far have not been applied widely in cheminformatics. More specifically, classification based on predictive association rules (CPAR), classification based on multiple association rules (CMAR) and classification based on association rules (CBA) are employed on three datasets using various descriptor sets. Experimental evaluations on anti-tuberculosis (antiTB), mutagenicity and hERG (the human Ether-a-go-go-Related Gene) blocker datasets show that these three methods are computationally scalable and appropriate for high speed mining. Additionally, they provide comparable accuracy and efficiency to the commonly used Bayesian and support vector machines (SVM) methods, and produce highly interpretable models. BioMed Central 2012-11-23 /pmc/articles/PMC3515428/ /pubmed/23176548 http://dx.doi.org/10.1186/1758-2946-4-29 Text en Copyright ©2012 Yu and Wild; licensee Chemistry Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methodology Yu, Pulan Wild, David J Fast rule-based bioactivity prediction using associative classification mining |
title | Fast rule-based bioactivity prediction using associative classification mining |
title_full | Fast rule-based bioactivity prediction using associative classification mining |
title_fullStr | Fast rule-based bioactivity prediction using associative classification mining |
title_full_unstemmed | Fast rule-based bioactivity prediction using associative classification mining |
title_short | Fast rule-based bioactivity prediction using associative classification mining |
title_sort | fast rule-based bioactivity prediction using associative classification mining |
topic | Methodology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3515428/ https://www.ncbi.nlm.nih.gov/pubmed/23176548 http://dx.doi.org/10.1186/1758-2946-4-29 |
work_keys_str_mv | AT yupulan fastrulebasedbioactivitypredictionusingassociativeclassificationmining AT wilddavidj fastrulebasedbioactivitypredictionusingassociativeclassificationmining |