
Structure and Sequence Aligned Code Summarization with Prefix and Suffix Balanced Strategy

Bibliographic Details

Main Authors: Zeng, Jianhui, Qu, Zhiheng, Cai, Bo
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10138082/
https://www.ncbi.nlm.nih.gov/pubmed/37190358
http://dx.doi.org/10.3390/e25040570
_version_ 1785032622358921216
author Zeng, Jianhui
Qu, Zhiheng
Cai, Bo
author_facet Zeng, Jianhui
Qu, Zhiheng
Cai, Bo
author_sort Zeng, Jianhui
collection PubMed
description Source code summarization focuses on generating high-quality natural language descriptions of a code snippet (e.g., its functionality, usage, and version). In real development environments, code descriptions are often missing or inconsistent with the code because of human factors, which makes it difficult for developers to comprehend the code and carry out subsequent maintenance. Some existing methods generate summaries from the sequence information of code without considering its structural information. Recently, researchers have adopted Graph Neural Networks (GNNs) over modified Abstract Syntax Trees (ASTs) to capture structural information and represent source code more comprehensively, but how to align the two information encoders is hard to decide. In this paper, we propose a source code summarization model named SSCS, a unified transformer-based encoder–decoder architecture, for capturing structural and sequence information. SSCS is built upon a structure-induced transformer with three main novel improvements. SSCS captures structural information at multiple scales with an adapted fusion strategy and adopts a hierarchical encoding strategy to capture textual information from the perspective of the whole document. Moreover, SSCS utilizes a bidirectional decoder that generates the summary from opposite directions to balance generation performance between the prefix and the suffix. We conduct experiments on two public datasets (Java and Python) to evaluate our method, and the results show that SSCS outperforms state-of-the-art code summarization methods.
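
The abstract does not include implementation details, so the following is only a rough, hypothetical Python sketch of the prefix/suffix balancing idea it describes: decode the summary both left-to-right and right-to-left, then keep the hypothesis with the better average token score, so that neither the beginning nor the end of the summary systematically suffers. The function names, the scoring scheme, and the stub decoders are illustrative assumptions, not the authors' code.

from typing import Callable, List, Tuple

Token = str
Hypothesis = Tuple[List[Token], float]  # (tokens, mean token log-probability)
StepFn = Callable[[List[Token]], Tuple[Token, float]]  # prefix -> (next token, log-prob)


def greedy_decode(step: StepFn, max_len: int = 30, eos: Token = "</s>") -> Hypothesis:
    # Plain greedy decoding loop; a real model would typically use beam search.
    tokens: List[Token] = []
    total = 0.0
    for _ in range(max_len):
        tok, logp = step(tokens)
        total += logp
        if tok == eos:
            break
        tokens.append(tok)
    return tokens, total / max(len(tokens), 1)


def balanced_summary(l2r_step: StepFn, r2l_step: StepFn, max_len: int = 30) -> List[Token]:
    # Run a left-to-right and a right-to-left decoder, then keep whichever
    # hypothesis scores better on average, balancing prefix and suffix quality.
    fwd_tokens, fwd_score = greedy_decode(l2r_step, max_len)
    bwd_tokens, bwd_score = greedy_decode(r2l_step, max_len)
    bwd_tokens = list(reversed(bwd_tokens))  # the R2L decoder emits the suffix first
    return fwd_tokens if fwd_score >= bwd_score else bwd_tokens


# Toy usage with stub decoders standing in for trained forward/backward decoders:
if __name__ == "__main__":
    fwd_vocab = ["returns", "the", "sum", "of", "two", "numbers", "</s>"]
    bwd_vocab = ["numbers", "two", "of", "sum", "the", "returns", "</s>"]

    def l2r(prefix: List[Token]) -> Tuple[Token, float]:
        return fwd_vocab[len(prefix)], -0.1

    def r2l(prefix: List[Token]) -> Tuple[Token, float]:
        return bwd_vocab[len(prefix)], -0.2

    print(" ".join(balanced_summary(l2r, r2l)))  # -> "returns the sum of two numbers"
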
format Online
Article
Text
id pubmed-10138082
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10138082 2023-04-28 Structure and Sequence Aligned Code Summarization with Prefix and Suffix Balanced Strategy Zeng, Jianhui Qu, Zhiheng Cai, Bo Entropy (Basel) Article Source code summarization focuses on generating high-quality natural language descriptions of a code snippet (e.g., its functionality, usage, and version). In real development environments, code descriptions are often missing or inconsistent with the code because of human factors, which makes it difficult for developers to comprehend the code and carry out subsequent maintenance. Some existing methods generate summaries from the sequence information of code without considering its structural information. Recently, researchers have adopted Graph Neural Networks (GNNs) over modified Abstract Syntax Trees (ASTs) to capture structural information and represent source code more comprehensively, but how to align the two information encoders is hard to decide. In this paper, we propose a source code summarization model named SSCS, a unified transformer-based encoder–decoder architecture, for capturing structural and sequence information. SSCS is built upon a structure-induced transformer with three main novel improvements. SSCS captures structural information at multiple scales with an adapted fusion strategy and adopts a hierarchical encoding strategy to capture textual information from the perspective of the whole document. Moreover, SSCS utilizes a bidirectional decoder that generates the summary from opposite directions to balance generation performance between the prefix and the suffix. We conduct experiments on two public datasets (Java and Python) to evaluate our method, and the results show that SSCS outperforms state-of-the-art code summarization methods. MDPI 2023-03-26 /pmc/articles/PMC10138082/ /pubmed/37190358 http://dx.doi.org/10.3390/e25040570 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Zeng, Jianhui
Qu, Zhiheng
Cai, Bo
Structure and Sequence Aligned Code Summarization with Prefix and Suffix Balanced Strategy
title Structure and Sequence Aligned Code Summarization with Prefix and Suffix Balanced Strategy
title_full Structure and Sequence Aligned Code Summarization with Prefix and Suffix Balanced Strategy
title_fullStr Structure and Sequence Aligned Code Summarization with Prefix and Suffix Balanced Strategy
title_full_unstemmed Structure and Sequence Aligned Code Summarization with Prefix and Suffix Balanced Strategy
title_short Structure and Sequence Aligned Code Summarization with Prefix and Suffix Balanced Strategy
title_sort structure and sequence aligned code summarization with prefix and suffix balanced strategy
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10138082/
https://www.ncbi.nlm.nih.gov/pubmed/37190358
http://dx.doi.org/10.3390/e25040570
work_keys_str_mv AT zengjianhui structureandsequencealignedcodesummarizationwithprefixandsuffixbalancedstrategy
AT quzhiheng structureandsequencealignedcodesummarizationwithprefixandsuffixbalancedstrategy
AT caibo structureandsequencealignedcodesummarizationwithprefixandsuffixbalancedstrategy