
Robustification of Naïve Bayes Classifier and Its Application for Microarray Gene Expression Data Analysis

The naïve Bayes classifier (NBC) is one of the most popular classifiers for class prediction or pattern recognition from microarray gene expression data (MGED). However, it is very much sensitive to outliers with the classical estimates of the location and scale parameters. It is one of the most imp...


Bibliographic Details
Main authors: Ahmed, Md. Shakil, Shahjaman, Md., Rana, Md. Masud, Mollah, Md. Nurul Haque
Format: Online Article Text
Language: English
Published: Hindawi 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5564130/
https://www.ncbi.nlm.nih.gov/pubmed/28848763
http://dx.doi.org/10.1155/2017/3020627
_version_ 1783258211898359808
author Ahmed, Md. Shakil
Shahjaman, Md.
Rana, Md. Masud
Mollah, Md. Nurul Haque
author_facet Ahmed, Md. Shakil
Shahjaman, Md.
Rana, Md. Masud
Mollah, Md. Nurul Haque
author_sort Ahmed, Md. Shakil
collection PubMed
description The naïve Bayes classifier (NBC) is one of the most popular classifiers for class prediction or pattern recognition from microarray gene expression data (MGED). However, it is very sensitive to outliers when the location and scale parameters are estimated classically. This is one of the most important drawbacks of the classical NBC for gene expression data analysis. Gene expression datasets are often contaminated by outliers because of the several steps involved in the data-generating process, from hybridization of DNA samples to image analysis. Therefore, in this paper, an attempt is made to robustify the Gaussian NBC by the minimum β-divergence method. The role of the minimum β-divergence method in this article is to produce robust estimators of the location and scale parameters from the training dataset and to detect and modify outliers in the test dataset. The performance of the proposed method depends on the tuning parameter β. It reduces to the traditional naïve Bayes classifier when β → 0. We investigated the performance of the proposed beta naïve Bayes classifier (β-NBC) in comparison with some popular existing classifiers (NBC, KNN, SVM, and AdaBoost) using both simulated and real gene expression datasets. We observed that the proposed method improved on the performance of the others in the presence of outliers. Otherwise, it maintains almost equal performance.
format Online
Article
Text
id pubmed-5564130
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-55641302017-08-28 Robustification of Naïve Bayes Classifier and Its Application for Microarray Gene Expression Data Analysis Ahmed, Md. Shakil Shahjaman, Md. Rana, Md. Masud Mollah, Md. Nurul Haque Biomed Res Int Research Article The naïve Bayes classifier (NBC) is one of the most popular classifiers for class prediction or pattern recognition from microarray gene expression data (MGED). However, it is very much sensitive to outliers with the classical estimates of the location and scale parameters. It is one of the most important drawbacks for gene expression data analysis by the classical NBC. The gene expression dataset is often contaminated by outliers due to several steps involved in the data generating process from hybridization of DNA samples to image analysis. Therefore, in this paper, an attempt is made to robustify the Gaussian NBC by the minimum β-divergence method. The role of minimum β-divergence method in this article is to produce the robust estimators for the location and scale parameters based on the training dataset and outlier detection and modification in test dataset. The performance of the proposed method depends on the tuning parameter β. It reduces to the traditional naïve Bayes classifier when β → 0. We investigated the performance of the proposed beta naïve Bayes classifier (β-NBC) in a comparison with some popular existing classifiers (NBC, KNN, SVM, and AdaBoost) using both simulated and real gene expression datasets. We observed that the proposed method improved the performance over the others in presence of outliers. Otherwise, it keeps almost equal performance. Hindawi 2017 2017-08-07 /pmc/articles/PMC5564130/ /pubmed/28848763 http://dx.doi.org/10.1155/2017/3020627 Text en Copyright © 2017 Md. Shakil Ahmed et al. 
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Ahmed, Md. Shakil
Shahjaman, Md.
Rana, Md. Masud
Mollah, Md. Nurul Haque
Robustification of Naïve Bayes Classifier and Its Application for Microarray Gene Expression Data Analysis
title Robustification of Naïve Bayes Classifier and Its Application for Microarray Gene Expression Data Analysis
title_full Robustification of Naïve Bayes Classifier and Its Application for Microarray Gene Expression Data Analysis
title_fullStr Robustification of Naïve Bayes Classifier and Its Application for Microarray Gene Expression Data Analysis
title_full_unstemmed Robustification of Naïve Bayes Classifier and Its Application for Microarray Gene Expression Data Analysis
title_short Robustification of Naïve Bayes Classifier and Its Application for Microarray Gene Expression Data Analysis
title_sort robustification of naïve bayes classifier and its application for microarray gene expression data analysis
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5564130/
https://www.ncbi.nlm.nih.gov/pubmed/28848763
http://dx.doi.org/10.1155/2017/3020627
work_keys_str_mv AT ahmedmdshakil robustificationofnaivebayesclassifieranditsapplicationformicroarraygeneexpressiondataanalysis
AT shahjamanmd robustificationofnaivebayesclassifieranditsapplicationformicroarraygeneexpressiondataanalysis
AT ranamdmasud robustificationofnaivebayesclassifieranditsapplicationformicroarraygeneexpressiondataanalysis
AT mollahmdnurulhaque robustificationofnaivebayesclassifieranditsapplicationformicroarraygeneexpressiondataanalysis
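
The description in this record says that the Gaussian NBC is robustified by replacing the classical location and scale estimates with minimum β-divergence estimates, and that the method reduces to the traditional NBC when β → 0. The Python sketch below illustrates that idea only; it is not the authors' implementation. It assumes the usual β-weight form exp(-β z²/2) for a Gaussian model, with a (1 + β) factor on the variance update for consistency. The function names (`beta_location_scale`, `fit_gaussian_nb`, `predict`), the median/MAD starting values, the stopping rule, the toy data, and β = 0.2 are all illustrative assumptions. The outlier detection and modification step for test data mentioned in the description is not covered.

```python
import numpy as np


def beta_location_scale(x, beta=0.2, n_iter=200, tol=1e-10):
    """Iteratively reweighted mean and variance for one feature (sketch).

    Each point gets the weight exp(-beta * z**2 / 2), where z is its
    standardized distance from the current mean, so distant points count less.
    With beta = 0 every weight is 1 and the result is the ordinary mean and
    the 1/n variance.
    """
    x = np.asarray(x, dtype=float)
    # Starting values (an assumption of this sketch): median and scaled MAD.
    mu = np.median(x)
    mad = np.median(np.abs(x - mu))
    var = (1.4826 * mad) ** 2 if mad > 0 else max(np.var(x), 1e-12)
    for _ in range(n_iter):
        w = np.exp(-0.5 * beta * (x - mu) ** 2 / var)
        new_mu = np.sum(w * x) / np.sum(w)
        # The (1 + beta) factor offsets the shrinkage caused by down-weighting.
        new_var = (1.0 + beta) * np.sum(w * (x - mu) ** 2) / np.sum(w)
        new_var = max(new_var, 1e-12)  # guard against a zero variance
        done = abs(new_mu - mu) < tol and abs(new_var - var) < tol
        mu, var = new_mu, new_var
        if done:
            break
    return mu, var


def fit_gaussian_nb(X, y, beta):
    """Per-class, per-feature (mean, variance, prior) using the estimator above."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    model = {}
    for c in np.unique(y):
        Xc = X[y == c]
        est = [beta_location_scale(Xc[:, j], beta) for j in range(X.shape[1])]
        mu = np.array([m for m, _ in est])
        var = np.array([v for _, v in est])
        model[c] = (mu, var, len(Xc) / len(X))
    return model


def predict(model, X):
    """Pick the class with the largest log prior plus Gaussian log-likelihood."""
    X = np.asarray(X, dtype=float)
    classes = list(model)
    scores = []
    for c in classes:
        mu, var, prior = model[c]
        log_lik = -0.5 * (np.log(2 * np.pi * var) + (X - mu) ** 2 / var).sum(axis=1)
        scores.append(np.log(prior) + log_lik)
    return np.array(classes)[np.argmax(np.array(scores), axis=0)]


# Toy one-feature training set (made up); class 0 contains one gross outlier.
class0 = [4.7, 4.8, 4.9, 5.0, 5.0, 5.1, 5.2, 5.3, 100.0]
class1 = [7.7, 7.8, 7.9, 8.0, 8.0, 8.1, 8.2, 8.3]
X_train = np.array(class0 + class1).reshape(-1, 1)
y_train = np.array([0] * len(class0) + [1] * len(class1))
X_test = np.array([[7.0]])

classical = fit_gaussian_nb(X_train, y_train, beta=0.0)  # ordinary Gaussian NBC
robust = fit_gaussian_nb(X_train, y_train, beta=0.2)     # beta-weighted variant
print("classical:", predict(classical, X_test), "robust:", predict(robust, X_test))
```

In this toy setup the single outlier inflates the classical mean and variance of class 0, which makes that class look plausible for almost any test value. The β-weighted estimates stay close to the bulk of the class 0 data, so the two fits can disagree on a test point that lies nearer to class 1.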