Paper Title
Code Summarization with Structure-induced Transformer
Paper Authors
Paper Abstract
Code summarization (CS) is an emerging area of language understanding that aims to automatically generate sensible natural-language descriptions for programming language in the form of source code, making development more convenient for programmers. It is well known that programming languages are highly structured. Previous works therefore attempt to apply structure-based traversal (SBT) or non-sequential models such as Tree-LSTM and graph neural networks (GNNs) to learn structural program semantics. Surprisingly, however, incorporating SBT into an advanced encoder such as the Transformer, rather than an LSTM, has been shown to yield no performance gain, leaving GNNs as the only remaining means of modeling such necessary structural clues in source code. To remove this limitation, we propose the structure-induced Transformer, which encodes sequential code inputs together with multi-view structural clues through a newly proposed structure-induced self-attention mechanism. Extensive experiments show that the proposed structure-induced Transformer achieves new state-of-the-art results on benchmarks.
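To make the core idea concrete, below is a minimal PyTorch sketch of a structure-masked self-attention layer, in which attention between code tokens is restricted to positions connected by a structural edge (for example, an AST or data-flow link). The class name, tensor shapes, and the single-view adjacency input are illustrative assumptions rather than the authors' implementation; the paper's multi-view setting would amount to applying such attention with several adjacency matrices and combining the results.

import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class StructureMaskedSelfAttention(nn.Module):
    # Self-attention whose weights are restricted by a structural adjacency mask
    # (e.g. AST edges or data-flow links between code tokens).
    # Illustrative sketch, not the paper's exact formulation.
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x:   (batch, seq_len, d_model) token representations of the code sequence
        # adj: (batch, seq_len, seq_len), 1 where a structural edge links two tokens;
        #      assumed to include self-loops so every row has at least one allowed slot
        b, n, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(b, n, self.n_heads, self.d_head).transpose(1, 2)
        k = k.view(b, n, self.n_heads, self.d_head).transpose(1, 2)
        v = v.view(b, n, self.n_heads, self.d_head).transpose(1, 2)

        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_head)
        # Structure induction: positions without a structural edge are masked out,
        # so each token attends only to its structural neighbours.
        mask = adj.unsqueeze(1) > 0                      # broadcast over heads
        scores = scores.masked_fill(~mask, float("-inf"))
        weights = F.softmax(scores, dim=-1)

        out = (weights @ v).transpose(1, 2).reshape(b, n, -1)
        return self.out(out)

In a multi-view setting, one such layer per structural view (e.g. sequence order, AST edges, data flow) could be applied to the same token sequence and the outputs combined, which is the general flavour of the structure-induced attention described in the abstract.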