Software defect prediction using learning to rank approach

Software defect prediction (SDP) plays a significant role in detecting the most likely defective software modules and optimizing the allocation of testing resources. In practice, though, project managers must not only identify defective modules, but also rank them in a specific order to optimize the...

Bibliographic Details
Main authors: Nassif, Ali Bou; Talib, Manar Abu; Azzeh, Mohammad; Alzaabi, Shaikha; Khanfar, Rawan; Kharsa, Ruba; Angelis, Lefteris
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10622444/
https://www.ncbi.nlm.nih.gov/pubmed/37919406
http://dx.doi.org/10.1038/s41598-023-45915-5
_version_ 1785130543026798592
author Nassif, Ali Bou
Talib, Manar Abu
Azzeh, Mohammad
Alzaabi, Shaikha
Khanfar, Rawan
Kharsa, Ruba
Angelis, Lefteris
author_facet Nassif, Ali Bou
Talib, Manar Abu
Azzeh, Mohammad
Alzaabi, Shaikha
Khanfar, Rawan
Kharsa, Ruba
Angelis, Lefteris
author_sort Nassif, Ali Bou
collection PubMed
description Software defect prediction (SDP) plays a significant role in detecting the most likely defective software modules and optimizing the allocation of testing resources. In practice, though, project managers must not only identify defective modules, but also rank them in a specific order to optimize resource allocation and minimize testing costs, especially for projects with limited budgets. This vital task can be accomplished using a Learning to Rank (LTR) algorithm. This type of machine learning methodology pursues two important tasks: prediction and learning. Although LTR is commonly used in information retrieval, it is also highly efficient for other problems, such as SDP. In defect prediction, the LTR approach is mainly used to predict and rank the most likely buggy modules based on their bug count or bug density. This research paper conducts a comprehensive comparison study of the behavior of eight selected LTR models using two target variables: bug count and bug density. It also studies the effect of imbalance learning and feature selection on the employed LTR models. The models are empirically evaluated using Fault Percentile Average. Our results show that using bug count as the ranking criterion produces higher scores and more stable results across multiple experiment settings. Moreover, imbalance learning has a positive impact for bug density but a negative impact for bug count. Lastly, feature selection does not show significant improvement for bug density, and it has no impact when bug count is used. Therefore, we conclude that using feature selection and imbalance learning with LTR does not yield superior or significant results.
format Online
Article
Text
id pubmed-10622444
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10622444 2023-11-04 Software defect prediction using learning to rank approach Nassif, Ali Bou Talib, Manar Abu Azzeh, Mohammad Alzaabi, Shaikha Khanfar, Rawan Kharsa, Ruba Angelis, Lefteris Sci Rep Article Software defect prediction (SDP) plays a significant role in detecting the most likely defective software modules and optimizing the allocation of testing resources. In practice, though, project managers must not only identify defective modules, but also rank them in a specific order to optimize resource allocation and minimize testing costs, especially for projects with limited budgets. This vital task can be accomplished using a Learning to Rank (LTR) algorithm. This type of machine learning methodology pursues two important tasks: prediction and learning. Although LTR is commonly used in information retrieval, it is also highly efficient for other problems, such as SDP. In defect prediction, the LTR approach is mainly used to predict and rank the most likely buggy modules based on their bug count or bug density. This research paper conducts a comprehensive comparison study of the behavior of eight selected LTR models using two target variables: bug count and bug density. It also studies the effect of imbalance learning and feature selection on the employed LTR models. The models are empirically evaluated using Fault Percentile Average. Our results show that using bug count as the ranking criterion produces higher scores and more stable results across multiple experiment settings. Moreover, imbalance learning has a positive impact for bug density but a negative impact for bug count. Lastly, feature selection does not show significant improvement for bug density, and it has no impact when bug count is used. Therefore, we conclude that using feature selection and imbalance learning with LTR does not yield superior or significant results.
Nature Publishing Group UK 2023-11-02 /pmc/articles/PMC10622444/ /pubmed/37919406 http://dx.doi.org/10.1038/s41598-023-45915-5 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Nassif, Ali Bou
Talib, Manar Abu
Azzeh, Mohammad
Alzaabi, Shaikha
Khanfar, Rawan
Kharsa, Ruba
Angelis, Lefteris
Software defect prediction using learning to rank approach
title Software defect prediction using learning to rank approach
title_full Software defect prediction using learning to rank approach
title_fullStr Software defect prediction using learning to rank approach
title_full_unstemmed Software defect prediction using learning to rank approach
title_short Software defect prediction using learning to rank approach
title_sort software defect prediction using learning to rank approach
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10622444/
https://www.ncbi.nlm.nih.gov/pubmed/37919406
http://dx.doi.org/10.1038/s41598-023-45915-5
work_keys_str_mv AT nassifalibou softwaredefectpredictionusinglearningtorankapproach
AT talibmanarabu softwaredefectpredictionusinglearningtorankapproach
AT azzehmohammad softwaredefectpredictionusinglearningtorankapproach
AT alzaabishaikha softwaredefectpredictionusinglearningtorankapproach
AT khanfarrawan softwaredefectpredictionusinglearningtorankapproach
AT kharsaruba softwaredefectpredictionusinglearningtorankapproach
AT angelislefteris softwaredefectpredictionusinglearningtorankapproach