Classification framework for faulty-software using enhanced exploratory whale optimizer-based feature selection scheme and random forest ensemble learning
Software Fault Prediction (SFP) is an important process to detect the faulty components of the software to detect faulty classes or faulty modules early in the software development life cycle. In this paper, a machine learning framework is proposed for SFP. Initially, pre-processing and re-sampling...
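Main authors: | Mafarja, Majdi; Thaher, Thaer; Al-Betar, Mohammed Azmi; Too, Jingwei; Awadallah, Mohammed A.; Abu Doush, Iyad; Turabieh, Hamza |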
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer US, 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9909674/ https://www.ncbi.nlm.nih.gov/pubmed/36785593 http://dx.doi.org/10.1007/s10489-022-04427-x |
_version_ | 1784884624441212928 |
---|---|
author | Mafarja, Majdi; Thaher, Thaer; Al-Betar, Mohammed Azmi; Too, Jingwei; Awadallah, Mohammed A.; Abu Doush, Iyad; Turabieh, Hamza |
author_facet | Mafarja, Majdi; Thaher, Thaer; Al-Betar, Mohammed Azmi; Too, Jingwei; Awadallah, Mohammed A.; Abu Doush, Iyad; Turabieh, Hamza |
author_sort | Mafarja, Majdi |
collection | PubMed |
description | Software Fault Prediction (SFP) is an important process to detect the faulty components of the software to detect faulty classes or faulty modules early in the software development life cycle. In this paper, a machine learning framework is proposed for SFP. Initially, pre-processing and re-sampling techniques are applied to make the SFP datasets ready to be used by ML techniques. Thereafter seven classifiers are compared, namely K-Nearest Neighbors (KNN), Naive Bayes (NB), Linear Discriminant Analysis (LDA), Linear Regression (LR), Decision Tree (DT), Support Vector Machine (SVM), and Random Forest (RF). The RF classifier outperforms all other classifiers in terms of eliminating irrelevant/redundant features. The performance of RF is improved further using a dimensionality reduction method called binary whale optimization algorithm (BWOA) to eliminate the irrelevant/redundant features. Finally, the performance of BWOA is enhanced by hybridizing the exploration strategies of the grey wolf optimizer (GWO) and harris hawks optimization (HHO) algorithms. The proposed method is called SBEWOA. The SFP datasets utilized are selected from the PROMISE repository using sixteen datasets for software projects with different sizes and complexity. The comparative evaluation against nine well-established feature selection methods proves that the proposed SBEWOA is able to significantly produce competitively superior results for several instances of the evaluated dataset. The algorithms’ performance is compared in terms of accuracy, the number of features, and fitness function. This is also proved by the 2-tailed P-values of the Wilcoxon signed ranks statistical test used. In conclusion, the proposed method is an efficient alternative ML method for SFP that can be used for similar problems in the software engineering domain. |
format | Online Article Text |
id | pubmed-9909674 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Springer US |
record_format | MEDLINE/PubMed |
spelling | pubmed-99096742023-02-09 Classification framework for faulty-software using enhanced exploratory whale optimizer-based feature selection scheme and random forest ensemble learning Mafarja, Majdi Thaher, Thaer Al-Betar, Mohammed Azmi Too, Jingwei Awadallah, Mohammed A. Abu Doush, Iyad Turabieh, Hamza Appl Intell (Dordr) Article Software Fault Prediction (SFP) is an important process to detect the faulty components of the software to detect faulty classes or faulty modules early in the software development life cycle. In this paper, a machine learning framework is proposed for SFP. Initially, pre-processing and re-sampling techniques are applied to make the SFP datasets ready to be used by ML techniques. Thereafter seven classifiers are compared, namely K-Nearest Neighbors (KNN), Naive Bayes (NB), Linear Discriminant Analysis (LDA), Linear Regression (LR), Decision Tree (DT), Support Vector Machine (SVM), and Random Forest (RF). The RF classifier outperforms all other classifiers in terms of eliminating irrelevant/redundant features. The performance of RF is improved further using a dimensionality reduction method called binary whale optimization algorithm (BWOA) to eliminate the irrelevant/redundant features. Finally, the performance of BWOA is enhanced by hybridizing the exploration strategies of the grey wolf optimizer (GWO) and harris hawks optimization (HHO) algorithms. The proposed method is called SBEWOA. The SFP datasets utilized are selected from the PROMISE repository using sixteen datasets for software projects with different sizes and complexity. The comparative evaluation against nine well-established feature selection methods proves that the proposed SBEWOA is able to significantly produce competitively superior results for several instances of the evaluated dataset. The algorithms’ performance is compared in terms of accuracy, the number of features, and fitness function. 
This is also proved by the 2-tailed P-values of the Wilcoxon signed ranks statistical test used. In conclusion, the proposed method is an efficient alternative ML method for SFP that can be used for similar problems in the software engineering domain. Springer US 2023-02-09 /pmc/articles/PMC9909674/ /pubmed/36785593 http://dx.doi.org/10.1007/s10489-022-04427-x Text en © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023, Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Article Mafarja, Majdi Thaher, Thaer Al-Betar, Mohammed Azmi Too, Jingwei Awadallah, Mohammed A. Abu Doush, Iyad Turabieh, Hamza Classification framework for faulty-software using enhanced exploratory whale optimizer-based feature selection scheme and random forest ensemble learning |
title | Classification framework for faulty-software using enhanced exploratory whale optimizer-based feature selection scheme and random forest ensemble learning |
title_full | Classification framework for faulty-software using enhanced exploratory whale optimizer-based feature selection scheme and random forest ensemble learning |
title_fullStr | Classification framework for faulty-software using enhanced exploratory whale optimizer-based feature selection scheme and random forest ensemble learning |
title_full_unstemmed | Classification framework for faulty-software using enhanced exploratory whale optimizer-based feature selection scheme and random forest ensemble learning |
title_short | Classification framework for faulty-software using enhanced exploratory whale optimizer-based feature selection scheme and random forest ensemble learning |
title_sort | classification framework for faulty-software using enhanced exploratory whale optimizer-based feature selection scheme and random forest ensemble learning |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9909674/ https://www.ncbi.nlm.nih.gov/pubmed/36785593 http://dx.doi.org/10.1007/s10489-022-04427-x |
work_keys_str_mv | AT mafarjamajdi classificationframeworkforfaultysoftwareusingenhancedexploratorywhaleoptimizerbasedfeatureselectionschemeandrandomforestensemblelearning AT thaherthaer classificationframeworkforfaultysoftwareusingenhancedexploratorywhaleoptimizerbasedfeatureselectionschemeandrandomforestensemblelearning AT albetarmohammedazmi classificationframeworkforfaultysoftwareusingenhancedexploratorywhaleoptimizerbasedfeatureselectionschemeandrandomforestensemblelearning AT toojingwei classificationframeworkforfaultysoftwareusingenhancedexploratorywhaleoptimizerbasedfeatureselectionschemeandrandomforestensemblelearning AT awadallahmohammeda classificationframeworkforfaultysoftwareusingenhancedexploratorywhaleoptimizerbasedfeatureselectionschemeandrandomforestensemblelearning AT abudoushiyad classificationframeworkforfaultysoftwareusingenhancedexploratorywhaleoptimizerbasedfeatureselectionschemeandrandomforestensemblelearning AT turabiehhamza classificationframeworkforfaultysoftwareusingenhancedexploratorywhaleoptimizerbasedfeatureselectionschemeandrandomforestensemblelearning |
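
The description field says the feature selection methods are compared on accuracy, number of selected features, and a fitness function, but this record does not give the fitness formula. The sketch below shows one common formulation from the wrapper feature selection literature: a weighted sum of classification error and the fraction of features kept. It is an assumption, not the authors' confirmed definition. The names `wrapper_fitness` and `toy_error`, and the weight `alpha = 0.99`, are hypothetical and chosen only for illustration.

```python
from typing import Callable, Sequence


def wrapper_fitness(
    mask: Sequence[int],
    error_rate: Callable[[Sequence[int]], float],
    alpha: float = 0.99,  # assumed weight; not taken from the record
) -> float:
    """Score a binary feature mask; lower is better.

    Combines the classifier's error on the selected features with the
    fraction of features kept, so smaller subsets win ties on error.
    """
    n_total = len(mask)
    n_selected = sum(1 for bit in mask if bit)
    if n_selected == 0:
        # An empty subset cannot train a classifier, so reject it.
        return float("inf")
    return alpha * error_rate(mask) + (1.0 - alpha) * n_selected / n_total


def toy_error(mask: Sequence[int]) -> float:
    """Hypothetical stand-in for a classifier's validation error.

    Pretends features 0 and 2 are the only informative ones: each one
    left out adds 0.2 to a base error of 0.1.
    """
    informative = (0, 2)
    missed = sum(1 for i in informative if not mask[i])
    return 0.1 + 0.2 * missed


if __name__ == "__main__":
    for candidate in ([1, 1, 1, 1], [1, 0, 1, 0], [0, 1, 0, 1]):
        print(candidate, round(wrapper_fitness(candidate, toy_error), 4))
```

In a real pipeline, `toy_error` would be replaced by the cross-validated error of a classifier (Random Forest, according to the description) trained on the selected columns, and a binary optimizer would search over masks to minimize this score.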