Python code smells detection using conventional machine learning models


Bibliographic Details
Main authors:	Sandouka, Rana; Aljamaan, Hamoud
Format:	Online Article Text
Language:	English
Published:	PeerJ Inc. 2023
Subjects:	Artificial Intelligence
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10280480/ https://www.ncbi.nlm.nih.gov/pubmed/37346528 http://dx.doi.org/10.7717/peerj-cs.1370

collection	PubMed
description	Code smells are poor design or implementation choices that hinder code maintenance and reduce software quality, so detecting them is an important part of software development. Recent studies have used machine learning algorithms for code smell detection, but most of them relied on code smell datasets built from Java code. This article proposes a Python code smell dataset for the Large Class and Long Method code smells. The dataset contains 1,000 samples for each code smell, with 18 features extracted from the source code. We also evaluated six machine learning models as baselines for Python code smell detection, using Accuracy and the Matthews correlation coefficient (MCC). The Random Forest ensemble performed best on Large Class detection, with an MCC of 0.77, while the decision tree performed best on Long Method detection, with an MCC of 0.89.
id	pubmed-10280480
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
spelling	pubmed-10280480 2023-06-21. Python code smells detection using conventional machine learning models. Sandouka, Rana; Aljamaan, Hamoud. PeerJ Comput Sci. Artificial Intelligence. PeerJ Inc. 2023-05-29. /pmc/articles/PMC10280480/ /pubmed/37346528 http://dx.doi.org/10.7717/peerj-cs.1370 Text en. ©2023 Sandouka and Aljamaan. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.

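The abstract in this record reports results as Accuracy and Matthews correlation coefficient (MCC). As a rough illustration of how those two measures are computed for a binary detector (smelly vs. not smelly), here is a minimal sketch. The function names and the confusion-matrix counts are hypothetical and are not taken from the article; returning 0.0 when the MCC denominator is zero is a common convention, assumed here.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of samples classified correctly."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Returns 0.0 when the denominator is zero (assumed convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for illustration only (not results from the article).
tp, tn, fp, fn = 45, 40, 10, 5
print(f"accuracy = {accuracy(tp, tn, fp, fn):.2f}")
print(f"MCC      = {mcc(tp, tn, fp, fn):.2f}")
```

Unlike Accuracy, MCC uses all four confusion-matrix cells, so it stays informative when the two classes are imbalanced.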