
Full “Laplacianised” posterior naive Bayesian algorithm

BACKGROUND: In the last decade the standard Naive Bayes (SNB) algorithm has been widely employed in multi-class classification problems in cheminformatics. This popularity is mainly due to the fact that the algorithm is simple to implement and in many cases yields respectable classification results....

Bibliographic Details
Main authors: Mussa, Hamse Y; Mitchell, John BO; Glen, Robert C
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3846418/
https://www.ncbi.nlm.nih.gov/pubmed/23968281
http://dx.doi.org/10.1186/1758-2946-5-37
author Mussa, Hamse Y
Mitchell, John BO
Glen, Robert C
collection PubMed
description BACKGROUND: In the last decade the standard Naive Bayes (SNB) algorithm has been widely employed in multi-class classification problems in cheminformatics. This popularity is mainly due to the fact that the algorithm is simple to implement and in many cases yields respectable classification results. Using clever heuristic arguments “anchored” by insightful cheminformatics knowledge, Xia et al. have simplified the SNB algorithm further and termed it the Laplacian Corrected Modified Naive Bayes (LCMNB) approach, which has been widely used in cheminformatics since its publication. In this note we mathematically illustrate the conditions under which Xia et al.’s simplification holds. It is our hope that this clarification could help Naive Bayes practitioners in deciding when it is appropriate to employ the LCMNB algorithm to classify large chemical datasets. RESULTS: A general formulation that subsumes the simplified Naive Bayes version is presented. Unlike the widely used NB method, the Standard Naive Bayes description presented in this work is discriminative (not generative) in nature, which may lead to possible further applications of the SNB method. CONCLUSIONS: Starting from a standard Naive Bayes (SNB) algorithm, we have derived mathematically the relationship between Xia et al.’s ingenious but heuristic algorithm and the SNB approach. We have also demonstrated the conditions under which Xia et al.’s crucial assumptions hold. We therefore hope that the cheminformatics community will find the new insight and recommendations useful.
format Online
Article
Text
id pubmed-3846418
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3846418 2013-12-06 Full “Laplacianised” posterior naive Bayesian algorithm Mussa, Hamse Y Mitchell, John BO Glen, Robert C J Cheminform Research Article BioMed Central 2013-08-23 /pmc/articles/PMC3846418/ /pubmed/23968281 http://dx.doi.org/10.1186/1758-2946-5-37 Text en Copyright © 2013 Mussa et al.; licensee Chemistry Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Full “Laplacianised” posterior naive Bayesian algorithm
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3846418/
https://www.ncbi.nlm.nih.gov/pubmed/23968281
http://dx.doi.org/10.1186/1758-2946-5-37