Clone-advisor: recommending code tokens and clone methods with deep learning and information retrieval

Bibliographic Details
Main Authors: Hammad, Muhammad; Babur, Önder; Abdul Basit, Hamid; van den Brand, Mark
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2021
Subjects: Data Mining and Machine Learning
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8641571/
https://www.ncbi.nlm.nih.gov/pubmed/34909463
http://dx.doi.org/10.7717/peerj-cs.737
author Hammad, Muhammad
Babur, Önder
Abdul Basit, Hamid
van den Brand, Mark
collection PubMed
description Software developers frequently reuse source code from repositories as it saves development time and effort. Code clones (similar code fragments) accumulated in these repositories represent often repeated functionalities and are candidates for reuse in an exploratory or rapid development. To facilitate code clone reuse, we previously presented DeepClone, a novel deep learning approach for modeling code clones along with non-cloned code to predict the next set of tokens (possibly a complete clone method body) based on the code written so far. The probabilistic nature of language modeling, however, can lead to code output with minor syntax or logic errors. To resolve this, we propose a novel approach called Clone-Advisor. We apply an information retrieval technique on top of DeepClone output to recommend real clone methods closely matching the predicted clone method, thus improving the original output by DeepClone. In this paper we have discussed and refined our previous work on DeepClone in much more detail. Moreover, we have quantitatively evaluated the performance and effectiveness of Clone-Advisor in clone method recommendation.
format Online
Article
Text
id pubmed-8641571
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-8641571 2021-12-13 PeerJ Comput Sci PeerJ Inc. 2021-11-09 Text en © 2021 Hammad et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title Clone-advisor: recommending code tokens and clone methods with deep learning and information retrieval
topic Data Mining and Machine Learning