Hybrid Feature-Learning-Based PSO-PCA Feature Engineering Approach for Blood Cancer Classification
Acute lymphoblastic leukemia (ALL) is a lethal blood cancer that is characterized by an abnormally increased number of immature lymphocytes in the blood or bone marrow. For effective treatment of ALL, early assessment of the disease is essential. Manual examination of stained blood smear images is cur...
Main authors: | Atteia, Ghada; Alnashwan, Rana; Hassan, Malak |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10453878/ https://www.ncbi.nlm.nih.gov/pubmed/37627931 http://dx.doi.org/10.3390/diagnostics13162672 |
_version_ | 1785096052058095616 |
---|---|
author | Atteia, Ghada Alnashwan, Rana Hassan, Malak |
author_facet | Atteia, Ghada Alnashwan, Rana Hassan, Malak |
author_sort | Atteia, Ghada |
collection | PubMed |
description | Acute lymphoblastic leukemia (ALL) is a lethal blood cancer that is characterized by an abnormally increased number of immature lymphocytes in the blood or bone marrow. For effective treatment of ALL, early assessment of the disease is essential. Manual examination of stained blood smear images is the current practice for initial ALL screening. This practice is time-consuming and error-prone. To diagnose ALL effectively, numerous deep-learning-based computer vision systems have been developed for detecting ALL in blood peripheral images (BPIs). Such systems extract a huge number of image features and use them to perform the classification task. The extracted features may include irrelevant or redundant ones that could reduce classification accuracy and increase the running time of the classifier. Feature selection is considered an effective tool to mitigate the curse of dimensionality and alleviate its corresponding shortcomings. One of the most effective dimensionality-reduction tools is principal component analysis (PCA), which maps input features into an orthogonal space and extracts the features that convey the highest variability from the data. Other feature selection approaches utilize evolutionary computation (EC) to search the feature space and localize optimal features. To profit from both feature selection approaches in improving the classification performance of ALL, a new hybrid deep-learning-based feature engineering approach is proposed in this study. The introduced approach integrates the capability of PCA and particle swarm optimization (PSO) to select informative features from BPIs with the feature-extraction power of pre-trained CNNs. Image features are first extracted through the feature-transfer capability of the GoogleNet convolutional neural network (CNN). PCA is utilized to generate a feature set of the principal components that covers 95% of the variability in the data. In parallel, bio-inspired particle swarm optimization is used to search for the optimal image features. The PCA- and PSO-derived feature sets are then integrated into a hybrid feature set that is used to train Bayesian-optimized support vector machine (SVM) and subspace discriminant ensemble-learning (SDEL) classifiers. The obtained results show improved classification performance for the ML classifiers trained on the proposed hybrid feature set over the original PCA, PSO, and all extracted feature sets for ALL multi-class classification. The Bayesian-optimized SVM trained with the proposed hybrid PCA-PSO feature set achieves the highest classification accuracy of 97.4%. The classification performance of the proposed feature engineering approach competes with the state of the art. |
format | Online Article Text |
id | pubmed-10453878 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10453878 2023-08-26 Hybrid Feature-Learning-Based PSO-PCA Feature Engineering Approach for Blood Cancer Classification Atteia, Ghada Alnashwan, Rana Hassan, Malak Diagnostics (Basel) Article Acute lymphoblastic leukemia (ALL) is a lethal blood cancer that is characterized by an abnormally increased number of immature lymphocytes in the blood or bone marrow. For effective treatment of ALL, early assessment of the disease is essential. Manual examination of stained blood smear images is the current practice for initial ALL screening. This practice is time-consuming and error-prone. To diagnose ALL effectively, numerous deep-learning-based computer vision systems have been developed for detecting ALL in blood peripheral images (BPIs). Such systems extract a huge number of image features and use them to perform the classification task. The extracted features may include irrelevant or redundant ones that could reduce classification accuracy and increase the running time of the classifier. Feature selection is considered an effective tool to mitigate the curse of dimensionality and alleviate its corresponding shortcomings. One of the most effective dimensionality-reduction tools is principal component analysis (PCA), which maps input features into an orthogonal space and extracts the features that convey the highest variability from the data. Other feature selection approaches utilize evolutionary computation (EC) to search the feature space and localize optimal features. To profit from both feature selection approaches in improving the classification performance of ALL, a new hybrid deep-learning-based feature engineering approach is proposed in this study. The introduced approach integrates the capability of PCA and particle swarm optimization (PSO) to select informative features from BPIs with the feature-extraction power of pre-trained CNNs. Image features are first extracted through the feature-transfer capability of the GoogleNet convolutional neural network (CNN). PCA is utilized to generate a feature set of the principal components that covers 95% of the variability in the data. In parallel, bio-inspired particle swarm optimization is used to search for the optimal image features. The PCA- and PSO-derived feature sets are then integrated into a hybrid feature set that is used to train Bayesian-optimized support vector machine (SVM) and subspace discriminant ensemble-learning (SDEL) classifiers. The obtained results show improved classification performance for the ML classifiers trained on the proposed hybrid feature set over the original PCA, PSO, and all extracted feature sets for ALL multi-class classification. The Bayesian-optimized SVM trained with the proposed hybrid PCA-PSO feature set achieves the highest classification accuracy of 97.4%. The classification performance of the proposed feature engineering approach competes with the state of the art. MDPI 2023-08-14 /pmc/articles/PMC10453878/ /pubmed/37627931 http://dx.doi.org/10.3390/diagnostics13162672 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Atteia, Ghada Alnashwan, Rana Hassan, Malak Hybrid Feature-Learning-Based PSO-PCA Feature Engineering Approach for Blood Cancer Classification |
title | Hybrid Feature-Learning-Based PSO-PCA Feature Engineering Approach for Blood Cancer Classification |
title_full | Hybrid Feature-Learning-Based PSO-PCA Feature Engineering Approach for Blood Cancer Classification |
title_fullStr | Hybrid Feature-Learning-Based PSO-PCA Feature Engineering Approach for Blood Cancer Classification |
title_full_unstemmed | Hybrid Feature-Learning-Based PSO-PCA Feature Engineering Approach for Blood Cancer Classification |
title_short | Hybrid Feature-Learning-Based PSO-PCA Feature Engineering Approach for Blood Cancer Classification |
title_sort | hybrid feature-learning-based pso-pca feature engineering approach for blood cancer classification |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10453878/ https://www.ncbi.nlm.nih.gov/pubmed/37627931 http://dx.doi.org/10.3390/diagnostics13162672 |
work_keys_str_mv | AT atteiaghada hybridfeaturelearningbasedpsopcafeatureengineeringapproachforbloodcancerclassification AT alnashwanrana hybridfeaturelearningbasedpsopcafeatureengineeringapproachforbloodcancerclassification AT hassanmalak hybridfeaturelearningbasedpsopcafeatureengineeringapproachforbloodcancerclassification |
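
The description field above outlines three feature-engineering steps: keep the principal components that cover 95% of the variance, search for a feature subset with particle swarm optimization, and merge the two sets. The following is a minimal plain-Python sketch of those steps under stated assumptions, not the authors' implementation. The function names, the toy fitness function, the relevance scores, and the PSO hyperparameters are all hypothetical; the paper itself works on GoogleNet features and evaluates subsets with trained classifiers.

```python
import math
import random


def components_for_variance(eigenvalues, threshold=0.95):
    """Smallest k such that the k largest eigenvalues explain >= threshold of the variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= threshold:
            return k
    return len(ordered)


def binary_pso(fitness, n_features, n_particles=20, n_iters=40, seed=0,
               inertia=0.7, c1=1.5, c2=1.5):
    """Maximise fitness(mask) over 0/1 feature masks with a basic binary PSO.

    Hyperparameter values are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    positions = [[rng.randint(0, 1) for _ in range(n_features)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * n_features for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    personal_score = [fitness(p) for p in positions]
    best = max(range(n_particles), key=personal_score.__getitem__)
    global_best, global_score = personal_best[best][:], personal_score[best]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (inertia * velocities[i][d]
                     + c1 * r1 * (personal_best[i][d] - positions[i][d])
                     + c2 * r2 * (global_best[d] - positions[i][d]))
                v = max(-6.0, min(6.0, v))  # clamp so the sigmoid stays well-behaved
                velocities[i][d] = v
                # Sigmoid transfer: velocity sets the probability that the bit is 1.
                positions[i][d] = 1 if rng.random() < 1.0 / (1.0 + math.exp(-v)) else 0
            score = fitness(positions[i])
            if score > personal_score[i]:
                personal_best[i], personal_score[i] = positions[i][:], score
                if score > global_score:
                    global_best, global_score = positions[i][:], score
    return global_best, global_score


def hybrid_features(pca_scores, raw_features, mask):
    """Concatenate PCA scores with the raw features that the PSO mask selects."""
    return list(pca_scores) + [x for x, keep in zip(raw_features, mask) if keep]


# Hypothetical per-feature relevance scores and size penalty for a toy fitness.
RELEVANCE = [0.9, 0.1, 0.8, 0.05, 0.7, 0.02]
PENALTY = 0.3


def toy_fitness(mask):
    """Reward relevant features and penalise large subsets (stand-in for classifier accuracy)."""
    return sum(r for r, keep in zip(RELEVANCE, mask) if keep) - PENALTY * sum(mask)


if __name__ == "__main__":
    print(components_for_variance([6.0, 3.0, 0.7, 0.2, 0.1]))  # 3
    mask, score = binary_pso(toy_fitness, len(RELEVANCE))
    print(mask, score)
    print(hybrid_features([0.12, -0.40], [5, 6, 7, 8, 9, 10], mask))
```

In a realistic setting the fitness function would be a cross-validated classifier score on the selected columns, which is far more expensive than the toy function used here.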