Clone-advisor: recommending code tokens and clone methods with deep learning and information retrieval
Software developers frequently reuse source code from repositories because it saves development time and effort. Code clones (similar code fragments) accumulated in these repositories represent frequently repeated functionality and are candidates for reuse in exploratory or rapid development. To facilita...
Main authors: | Muhammad Hammad, Önder Babur, Hamid Abdul Basit, Mark van den Brand |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2021 |
Subjects: | Data Mining and Machine Learning |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8641571/ https://www.ncbi.nlm.nih.gov/pubmed/34909463 http://dx.doi.org/10.7717/peerj-cs.737 |
_version_ | 1784609520705601536 |
---|---|
author | Hammad, Muhammad; Babur, Önder; Abdul Basit, Hamid; van den Brand, Mark |
author_sort | Hammad, Muhammad |
collection | PubMed |
description | Software developers frequently reuse source code from repositories because it saves development time and effort. Code clones (similar code fragments) accumulated in these repositories represent frequently repeated functionality and are candidates for reuse in exploratory or rapid development. To facilitate code clone reuse, we previously presented DeepClone, a novel deep learning approach that models code clones along with non-cloned code to predict the next set of tokens (possibly a complete clone method body) based on the code written so far. The probabilistic nature of language modeling, however, can lead to code output with minor syntax or logic errors. To resolve this, we propose a novel approach called Clone-Advisor. We apply an information retrieval technique on top of the DeepClone output to recommend real clone methods that closely match the predicted clone method, thus improving on the original DeepClone output. In this paper, we discuss and refine our previous work on DeepClone in much more detail. Moreover, we quantitatively evaluate the performance and effectiveness of Clone-Advisor in clone method recommendation. |
format | Online Article Text |
id | pubmed-8641571 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8641571 2021-12-13 PeerJ Comput Sci Data Mining and Machine Learning PeerJ Inc. 2021-11-09 /pmc/articles/PMC8641571/ /pubmed/34909463 http://dx.doi.org/10.7717/peerj-cs.737 Text en © 2021 Hammad et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited. |
title | Clone-advisor: recommending code tokens and clone methods with deep learning and information retrieval |
topic | Data Mining and Machine Learning |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8641571/ https://www.ncbi.nlm.nih.gov/pubmed/34909463 http://dx.doi.org/10.7717/peerj-cs.737 |
work_keys_str_mv | AT hammadmuhammad cloneadvisorrecommendingcodetokensandclonemethodswithdeeplearningandinformationretrieval AT baburonder cloneadvisorrecommendingcodetokensandclonemethodswithdeeplearningandinformationretrieval AT abdulbasithamid cloneadvisorrecommendingcodetokensandclonemethodswithdeeplearningandinformationretrieval AT vandenbrandmark cloneadvisorrecommendingcodetokensandclonemethodswithdeeplearningandinformationretrieval |
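
The description above says that an information retrieval step is applied to the predicted clone method in order to recommend real clone methods that closely match it. The record does not say which retrieval technique the authors use, so the following is only a minimal sketch under stated assumptions: it swaps in plain TF-IDF weighting with cosine similarity over code tokens, and the names `tokenize`, `tfidf_vectors`, `cosine`, `recommend`, and the three-method corpus are all hypothetical illustrations, not part of Clone-Advisor.

```python
import math
import re
from collections import Counter


def tokenize(code):
    """Split source text into identifiers, numbers, and single symbols."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def tfidf_vectors(token_lists):
    """Build one sparse TF-IDF vector (token -> weight) per token list."""
    n_docs = len(token_lists)
    # Document frequency: in how many token lists does each token occur?
    doc_freq = Counter(tok for toks in token_lists for tok in set(toks))
    vectors = []
    for toks in token_lists:
        counts = Counter(toks)
        vectors.append({
            # Smoothed IDF so that no weight is zero or undefined.
            tok: count * (math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0)
            for tok, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(tok, 0.0) for tok, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(predicted_method, clone_corpus, top_k=3):
    """Rank real clone methods by similarity to a predicted method."""
    docs = [tokenize(m) for m in clone_corpus] + [tokenize(predicted_method)]
    vectors = tfidf_vectors(docs)
    query = vectors[-1]  # the predicted method is the last document
    scored = [(cosine(query, vec), method)
              for vec, method in zip(vectors, clone_corpus)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical corpus of real clone methods (assumption, for illustration).
corpus = [
    "public void close() { if (stream != null) { stream.close(); } }",
    "public int add(int a, int b) { return a + b; }",
    "public String read(File f) { return new String(Files.readAllBytes(f.toPath())); }",
]
# A model-predicted method with minor syntax errors (missing brace and semicolon).
predicted = "public void close() { if (stream != null) stream.close() }"

ranked = recommend(predicted, corpus, top_k=2)
for score, method in ranked:
    print(f"{score:.3f}  {method}")
```

In this sketch the slightly malformed prediction is used only as a query, and what is returned is a syntactically intact method from the corpus, which mirrors the idea of recommending real clone methods in place of the raw model output.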