Fast rule-based bioactivity prediction using associative classification mining


Bibliographic Details
Main authors: Yu, Pulan; Wild, David J
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3515428/
https://www.ncbi.nlm.nih.gov/pubmed/23176548
http://dx.doi.org/10.1186/1758-2946-4-29
author Yu, Pulan
Wild, David J
collection PubMed
description Relating chemical features to bioactivities is critical in molecular design and is used extensively in the lead discovery and optimization process. A variety of techniques from statistics, data mining and machine learning have been applied to this process. In this study, we utilize a collection of methods, called associative classification mining (ACM), which are popular in the data mining community, but so far have not been applied widely in cheminformatics. More specifically, classification based on predictive association rules (CPAR), classification based on multiple association rules (CMAR) and classification based on association rules (CBA) are employed on three datasets using various descriptor sets. Experimental evaluations on anti-tuberculosis (antiTB), mutagenicity and hERG (the human Ether-a-go-go-Related Gene) blocker datasets show that these three methods are computationally scalable and appropriate for high speed mining. Additionally, they provide comparable accuracy and efficiency to the commonly used Bayesian and support vector machines (SVM) methods, and produce highly interpretable models.
format Online
Article
Text
id pubmed-3515428
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3515428 | 2012-12-06 | Fast rule-based bioactivity prediction using associative classification mining | Yu, Pulan; Wild, David J | J Cheminform | Methodology | BioMed Central | 2012-11-23 | /pmc/articles/PMC3515428/ | /pubmed/23176548 | http://dx.doi.org/10.1186/1758-2946-4-29 | Text | en | Copyright ©2012 Yu and Wild; licensee Chemistry Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Fast rule-based bioactivity prediction using associative classification mining
topic Methodology