An Adaptive Rank Aggregation-Based Ensemble Multi-Filter Feature Selection Method in Software Defect Prediction

Feature selection is known to be an applicable solution to address the problem of high dimensionality in software defect prediction (SDP). However, choosing an appropriate filter feature selection (FFS) method that will generate and guarantee optimal features in SDP is an open research issue, known...

Bibliographic Details
Main authors: Balogun, Abdullateef O., Basri, Shuib, Capretz, Luiz Fernando, Mahamad, Saipunidzam, Imam, Abdullahi A., Almomani, Malek A., Adeyemo, Victor E., Kumar, Ganesh
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8535152/
https://www.ncbi.nlm.nih.gov/pubmed/34681999
http://dx.doi.org/10.3390/e23101274
author Balogun, Abdullateef O.
Basri, Shuib
Capretz, Luiz Fernando
Mahamad, Saipunidzam
Imam, Abdullahi A.
Almomani, Malek A.
Adeyemo, Victor E.
Kumar, Ganesh
collection PubMed
description Feature selection is known to be an applicable solution to address the problem of high dimensionality in software defect prediction (SDP). However, choosing an appropriate filter feature selection (FFS) method that will generate and guarantee optimal features in SDP is an open research issue, known as the filter rank selection problem. As a solution, the combination of multiple filter methods can alleviate the filter rank selection problem. In this study, a novel adaptive rank aggregation-based ensemble multi-filter feature selection (AREMFFS) method is proposed to resolve high dimensionality and filter rank selection problems in SDP. Specifically, the proposed AREMFFS method is based on assessing and combining the strengths of individual FFS methods by aggregating multiple rank lists in the generation and subsequent selection of top-ranked features to be used in the SDP process. The efficacy of the proposed AREMFFS method is evaluated with decision tree (DT) and naïve Bayes (NB) models on defect datasets from different repositories with diverse defect granularities. Findings from the experimental results indicated the superiority of AREMFFS over other baseline FFS methods that were evaluated, existing rank aggregation based multi-filter FS methods, and variants of AREMFFS as developed in this study. That is, the proposed AREMFFS method not only had a superior effect on prediction performances of SDP models but also outperformed baseline FS methods and existing rank aggregation based multi-filter FS methods. Therefore, this study recommends the combination of multiple FFS methods to utilize the strength of respective FFS methods and take advantage of filter–filter relationships in selecting optimal features for SDP processes.
format Online
Article
Text
id pubmed-8535152
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8535152 2021-10-23 Entropy (Basel) Article MDPI 2021-09-29 /pmc/articles/PMC8535152/ /pubmed/34681999 http://dx.doi.org/10.3390/e23101274 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title An Adaptive Rank Aggregation-Based Ensemble Multi-Filter Feature Selection Method in Software Defect Prediction
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8535152/
https://www.ncbi.nlm.nih.gov/pubmed/34681999
http://dx.doi.org/10.3390/e23101274