
A Novel Approach for Feature Selection and Classification of Diabetes Mellitus: Machine Learning Methods


Bibliographic Details
Main authors: Saxena, Roshi; Sharma, Sanjay Kumar; Gupta, Manali; Sampada, G. C.
Format: Online Article Text
Language: English
Published: Hindawi 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9033325/
https://www.ncbi.nlm.nih.gov/pubmed/35463255
http://dx.doi.org/10.1155/2022/3820360
_version_ 1784692860521545728
author Saxena, Roshi
Sharma, Sanjay Kumar
Gupta, Manali
Sampada, G. C.
collection PubMed
description Diabetes prediction is an active research area in which medical experts are trying to predict the disease more accurately. Surveys conducted by the WHO have shown a remarkable increase in the number of diabetic patients. Diabetes often remains dormant, and it aggravates other conditions a patient may have, such as damage to the kidney vessels, problems in the retina of the eye, and cardiac problems; left unidentified, it can cause metabolic disorders and many complications in the body. The main objective of our study is to compare different classifiers and feature selection methods for predicting diabetes with greater accuracy. In this paper, we studied multilayer perceptron, decision tree, K-nearest neighbour, and random forest classifiers, and applied a few feature selection techniques to the classifiers to detect diabetes at an early stage. The raw data are preprocessed by removing outliers and imputing missing values with the mean, and the hyperparameters are then optimized. Experiments were conducted on the PIMA Indians diabetes dataset using Weka 3.9. The accuracy achieved is 77.60% for multilayer perceptron, 76.07% for decision trees, 78.58% for K-nearest neighbour, and 79.8% for random forest, the highest of the four classifiers.
format Online
Article
Text
id pubmed-9033325
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-9033325 2022-04-23 A Novel Approach for Feature Selection and Classification of Diabetes Mellitus: Machine Learning Methods Saxena, Roshi Sharma, Sanjay Kumar Gupta, Manali Sampada, G. C. Comput Intell Neurosci Research Article Hindawi 2022-04-15 /pmc/articles/PMC9033325/ /pubmed/35463255 http://dx.doi.org/10.1155/2022/3820360 Text en Copyright © 2022 Roshi Saxena et al.
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A Novel Approach for Feature Selection and Classification of Diabetes Mellitus: Machine Learning Methods
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9033325/
https://www.ncbi.nlm.nih.gov/pubmed/35463255
http://dx.doi.org/10.1155/2022/3820360
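
The abstract of this record describes a pipeline of outlier removal, mean imputation of missing values, and classification with, among others, a K-nearest neighbour classifier. The sketch below illustrates those three steps in plain Python. It is not the authors' implementation (they report using Weka 3.9), and several details are assumptions made only for illustration: the use of `None` as the missing-value marker, the 1.5 x IQR rule for outliers (the abstract does not say which rule was used), the function names, and the toy values, which are invented and are not taken from the PIMA Indians diabetes dataset.

```python
import math
from collections import Counter
from statistics import mean, quantiles


def impute_mean(rows, missing=None):
    """Replace each missing entry with the mean of the observed values in its column."""
    columns = list(zip(*rows))
    col_means = [mean(v for v in col if v is not missing) for col in columns]
    return [
        [col_means[j] if v is missing else v for j, v in enumerate(row)]
        for row in rows
    ]


def remove_outliers_iqr(rows, labels, k=1.5):
    """Drop rows with any value outside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule)."""
    bounds = []
    for col in zip(*rows):
        q1, _, q3 = quantiles(col, n=4)
        iqr = q3 - q1
        bounds.append((q1 - k * iqr, q3 + k * iqr))
    kept = [
        (row, label)
        for row, label in zip(rows, labels)
        if all(lo <= v <= hi for v, (lo, hi) in zip(row, bounds))
    ]
    return [row for row, _ in kept], [label for _, label in kept]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training rows closest to query (Euclidean distance)."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Invented toy rows of (glucose, BMI); label 1 = diabetic, 0 = not diabetic.
toy_rows = [
    [85.0, 22.0], [90.0, None], [88.0, 24.0], [92.0, 23.0],
    [150.0, 33.0], [160.0, 35.0], [None, 34.0], [155.0, 36.0],
]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]

imputed = impute_mean(toy_rows)
kept_rows, kept_labels = remove_outliers_iqr(imputed, toy_labels)
print(knn_predict(kept_rows, kept_labels, [158.0, 34.5]))
```

Here imputation runs before the outlier filter so that the quartiles can be computed on complete columns; the abstract lists the two steps without fixing an order, so this ordering is also an assumption.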