Paper Title
Learning Program Semantics with Code Representations: An Empirical Study
Authors
Abstract
Program semantics learning is core and fundamental to various code intelligence tasks, e.g., vulnerability detection and clone detection. A considerable number of existing works propose diverse approaches to learning program semantics for different tasks, and these works have achieved state-of-the-art performance. However, a comprehensive and systematic study that evaluates different program representation techniques across diverse tasks is still missing. From this starting point, in this paper, we conduct an empirical study to evaluate different program representation techniques. Specifically, we categorize current mainstream code representation techniques into four categories, i.e., Feature-based, Sequence-based, Tree-based, and Graph-based program representation techniques, and evaluate their performance on three diverse and popular code intelligence tasks, i.e., Code Classification, Vulnerability Detection, and Clone Detection, on the publicly released benchmark. We further design three research questions (RQs) and conduct a comprehensive analysis to investigate the performance. From the extensive experimental results, we conclude that (1) the graph-based representation is superior to the other selected techniques across these tasks; (2) compared with the node type information used in tree-based and graph-based representations, the node textual information is more critical to learning program semantics; and (3) different tasks require task-specific semantics to achieve their highest performance; however, combining various program semantics from different dimensions, such as control dependency and data dependency, can still produce promising results.
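To make the distinction between representation categories and between node type and node textual information more concrete, the following is a minimal sketch using only Python's standard `tokenize` and `ast` modules. It is not the pipeline used in the study: the sample snippet and the helper names `token_sequence`, `node_types`, and `node_texts` are hypothetical and chosen purely for illustration. Graph-based representations would additionally add edges such as control and data dependencies, which this sketch does not compute.

```python
import ast
import io
import tokenize

# Hypothetical sample program used only for illustration.
SOURCE = "def add(a, b):\n    return a + b\n"


def token_sequence(source):
    """Sequence-based view: the flat list of lexical tokens."""
    # Layout-only tokens carry no lexical content, so they are skipped.
    skip = {tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
            tokenize.DEDENT, tokenize.ENDMARKER}
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type not in skip]


def node_types(source):
    """Tree-based view, node type information only (AST class names)."""
    return [type(node).__name__ for node in ast.walk(ast.parse(source))]


def node_texts(source):
    """Tree-based view, node textual information (identifiers in the AST)."""
    texts = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            texts.append(node.name)   # function name
        elif isinstance(node, ast.arg):
            texts.append(node.arg)    # parameter name
        elif isinstance(node, ast.Name):
            texts.append(node.id)     # variable reference
    return texts


if __name__ == "__main__":
    print(token_sequence(SOURCE))
    print(node_types(SOURCE))
    print(node_texts(SOURCE))
```

In this toy setting, the node types describe only the syntactic shape of the program (a function definition containing a return of a binary operation), while the node texts carry the identifiers that a reader would use to guess what the function does.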