Recent Advances in Intelligent Source Code Generation: A Survey on Natural Language Based Studies

Source Code Generation (SCG) is a prevalent research field in the automation software engineering sector that maps specific descriptions to various sorts of executable code. Along with the numerous intensive studies, diverse SCG types that integrate different scenarios and contexts continue to emerg...

Bibliographic Details
Main Authors:	Yang, Chen; Liu, Yan; Yin, Changqing
Format:	Online Article Text
Language:	English
Published:	MDPI 2021
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8470470/ https://www.ncbi.nlm.nih.gov/pubmed/34573800 http://dx.doi.org/10.3390/e23091174

_version_	1784574207719374848
author	Yang, Chen; Liu, Yan; Yin, Changqing
author_facet	Yang, Chen; Liu, Yan; Yin, Changqing
author_sort	Yang, Chen
collection	PubMed
description	Source Code Generation (SCG) is a prevalent research field in automated software engineering that maps specific descriptions to various kinds of executable code. Alongside numerous intensive studies, diverse SCG types that integrate different scenarios and contexts continue to emerge. As the ultimate goal of SCG, Natural Language-based Source Code Generation (NLSCG) is growing into an attractive and challenging field, owing to the expressiveness and extremely high level of abstraction of its input. The rapidly growing large-scale datasets drawn from open-source code repositories and Q&A resources, innovations in machine learning algorithms, and advances in computing capacity make the NLSCG field promising and create more opportunities for implementing and refining models. We also observe a recent surge of interest in NLSCG-related studies, which represent a wide variety of technical schools. However, many studies are bound to specific datasets with customization issues, producing occasional successes with tentative technical methods. No systematic study yet explores and promotes the further development of this field. We carried out a systematic literature survey and tool research to find potential directions for improvement. First, we position NLSCG among the various SCG genres and specify the generation context empirically, drawing on software development domain knowledge and programming experience. Second, we explore the selected studies collected through a carefully designed snowballing process, clarify the NLSCG field, and characterize the NLSCG problem, which lays a foundation for our subsequent investigation. Third, we model the research problems in terms of technical focus and adaptive challenges, and elaborate on insights gained from the NLSCG research backlog. Finally, we summarize the latest technology landscape around the transformation model and describe the critical tactics used in its essential components and their correlations. This research addresses the challenges of bridging the gap between natural language processing and source code analytics, outlines different dimensions of NLSCG research concerns and technical utilities, and presents a bounded technical context for NLSCG to facilitate future studies in this promising area.
format	Online Article Text
id	pubmed-8470470
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-8470470 2021-09-27 Recent Advances in Intelligent Source Code Generation: A Survey on Natural Language Based Studies Yang, Chen; Liu, Yan; Yin, Changqing Entropy (Basel) Article MDPI 2021-09-07 /pmc/articles/PMC8470470/ /pubmed/34573800 http://dx.doi.org/10.3390/e23091174 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle	Article Yang, Chen Liu, Yan Yin, Changqing Recent Advances in Intelligent Source Code Generation: A Survey on Natural Language Based Studies
title	Recent Advances in Intelligent Source Code Generation: A Survey on Natural Language Based Studies
title_full	Recent Advances in Intelligent Source Code Generation: A Survey on Natural Language Based Studies
title_fullStr	Recent Advances in Intelligent Source Code Generation: A Survey on Natural Language Based Studies
title_full_unstemmed	Recent Advances in Intelligent Source Code Generation: A Survey on Natural Language Based Studies
title_short	Recent Advances in Intelligent Source Code Generation: A Survey on Natural Language Based Studies
title_sort	recent advances in intelligent source code generation: a survey on natural language based studies
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8470470/ https://www.ncbi.nlm.nih.gov/pubmed/34573800 http://dx.doi.org/10.3390/e23091174
work_keys_str_mv	AT yangchen recentadvancesinintelligentsourcecodegenerationasurveyonnaturallanguagebasedstudies AT liuyan recentadvancesinintelligentsourcecodegenerationasurveyonnaturallanguagebasedstudies AT yinchangqing recentadvancesinintelligentsourcecodegenerationasurveyonnaturallanguagebasedstudies

Similar Items