
Feature Selection and Classification of Clinical Datasets Using Bioinspired Algorithms and Super Learner

A computer-aided diagnosis (CAD) system that employs a super learner to diagnose the presence or absence of a disease has been developed. Each clinical dataset is preprocessed and split into training set (60%) and testing set (40%). A wrapper approach that uses three bioinspired algorithms, namely,...


Bibliographic Details
Main authors: Murugesan, S., Bhuvaneswaran, R. S., Khanna Nehemiah, H., Keerthana Sankari, S., Nancy Jane, Y.
Format: Online Article Text
Language: English
Published: Hindawi 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8149240/
https://www.ncbi.nlm.nih.gov/pubmed/34055041
http://dx.doi.org/10.1155/2021/6662420
_version_ 1783697921649147904
author Murugesan, S.
Bhuvaneswaran, R. S.
Khanna Nehemiah, H.
Keerthana Sankari, S.
Nancy Jane, Y.
collection PubMed
description A computer-aided diagnosis (CAD) system that employs a super learner to diagnose the presence or absence of a disease has been developed. Each clinical dataset is preprocessed and split into a training set (60%) and a testing set (40%). A wrapper approach that uses three bioinspired algorithms, namely, cat swarm optimization (CSO), krill herd (KH), and bacterial foraging optimization (BFO), with the classification accuracy of a support vector machine (SVM) as the fitness function, has been used for feature selection. The features selected by each bioinspired algorithm are stored in three separate databases and are used to train three back propagation neural networks (BPNN) independently using the conjugate gradient algorithm (CGA). Each trained classifier is tested on the testing set, and the diagnostic results obtained are used to evaluate the performance of each classifier. The classification results that the three classifiers produce for each instance of the testing set, together with the class label associated with that instance, form the candidate instances for training and testing the super learner. Of these candidate instances, 80% form the super learner's training set and 20% form its testing set. Experimentation has been carried out using seven clinical datasets from the University of California Irvine (UCI) machine learning repository. The super learner has achieved a classification accuracy of 96.83% for the Wisconsin diagnostic breast cancer dataset (WDBC), 86.36% for the Statlog heart disease dataset (SHD), 94.74% for the hepatocellular carcinoma dataset (HCC), 90.48% for the hepatitis dataset (HD), 81.82% for the vertebral column dataset (VCD), 84% for the Cleveland heart disease dataset (CHD), and 70% for the Indian liver patient dataset (ILP).
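The data flow described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the helper names (`split`, `apply_mask`, `build_meta_instances`, `accuracy`) and the toy labels and predictions are hypothetical, and the SVM, BPNN, and bioinspired search steps are left out. It shows how a binary feature mask would select columns in a wrapper approach, and how the three base classifiers' outputs plus the class label become the instances on which the super learner is trained (80%) and tested (20%).

```python
import random


def split(instances, train_fraction, seed=0):
    """Shuffle a copy of the instances and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = list(instances)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def apply_mask(row, mask):
    """Keep only the features whose mask bit is 1 (one wrapper candidate)."""
    return [value for value, keep in zip(row, mask) if keep]


def accuracy(predicted, actual):
    """Fraction of instances whose predicted label equals the true label."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def build_meta_instances(base_predictions, labels):
    """Build one meta-instance per testing-set instance: the base
    classifiers' outputs as features, paired with the true class label."""
    return [(list(preds), label)
            for preds, label in zip(zip(*base_predictions), labels)]


# Hypothetical toy data: true labels of 10 testing-set instances and the
# outputs of three base classifiers (one per feature-selection algorithm).
labels   = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
pred_cso = [1, 0, 1, 0, 0, 0, 1, 0, 1, 0]
pred_kh  = [1, 0, 1, 1, 0, 1, 1, 0, 1, 0]
pred_bfo = [0, 0, 1, 1, 0, 0, 1, 0, 1, 1]

# Each base classifier is evaluated on the testing set.
base_accuracy = {
    "CSO": accuracy(pred_cso, labels),
    "KH": accuracy(pred_kh, labels),
    "BFO": accuracy(pred_bfo, labels),
}

# Candidate instances for the super learner, then the 80/20 split.
meta = build_meta_instances([pred_cso, pred_kh, pred_bfo], labels)
meta_train, meta_test = split(meta, train_fraction=0.8)

# A wrapper candidate is a binary mask over the original features.
selected = apply_mask([5.1, 3.5, 1.4, 0.2], mask=[1, 0, 1, 0])
```

In a full wrapper, the fitness of each mask would be the accuracy of a classifier trained on the masked features (an SVM in the abstract); here `accuracy` is shown only on precomputed predictions.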
format Online
Article
Text
id pubmed-8149240
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-8149240 2021-05-27 Feature Selection and Classification of Clinical Datasets Using Bioinspired Algorithms and Super Learner Murugesan, S. Bhuvaneswaran, R. S. Khanna Nehemiah, H. Keerthana Sankari, S. Nancy Jane, Y. Comput Math Methods Med Research Article Hindawi 2021-05-17 /pmc/articles/PMC8149240/ /pubmed/34055041 http://dx.doi.org/10.1155/2021/6662420 Text en Copyright © 2021 S. Murugesan et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Feature Selection and Classification of Clinical Datasets Using Bioinspired Algorithms and Super Learner
topic Research Article