Paper title
Defect Identification, Categorization, and Repair: Better Together
Paper authors
Paper abstract
Just-In-Time defect prediction (JIT-DP) models can identify defect-inducing commits at check-in time. Although previous studies have made great progress, they still have the following limitations: 1) useful information (e.g., semantic information and structural information) is not fully used; 2) existing work can only predict a commit as buggy or clean, without more information about what type of defect it contains; 3) a commit may involve changes in many files, which makes it difficult to locate the defect; 4) prior studies treat defect identification and defect repair as separate tasks, and none aims to handle both tasks simultaneously. In this paper, to address these limitations, we propose a comprehensive defect prediction and repair framework named CompDefect, which can identify whether a changed function (a more fine-grained level) is defect-prone, categorize the type of defect, and repair such a defect automatically if it falls into one of several scenarios, e.g., defects with single-statement fixes, or those that match a small set of defect templates. Generally, the first two tasks in CompDefect are treated as a multiclass classification task, while the last one is treated as a sequence generation task. The whole input of CompDefect consists of three parts (taking positive functions as an example): the clean version of a function (i.e., the version before the defect was introduced), the buggy version of the function, and the fixed version of the function. In the multiclass classification task, CompDefect categorizes the type of defect using the information in both the clean version and the buggy version. In the code sequence generation task, CompDefect repairs the defect once it is identified, or otherwise keeps the function unchanged.
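To make the three-part input concrete, the following is a minimal Python sketch of how one function-level sample might be split into a classification example and a generation example. It is an illustration under stated assumptions, not the authors' implementation: the names `FunctionSample`, `changed_version`, `classification_example`, `generation_example`, and the label strings are all hypothetical, and the abstract does not specify the defect categories or how clean functions are encoded.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical label for non-defective functions; the abstract does not
# name the actual defect categories.
CLEAN_LABEL = "clean"


@dataclass(frozen=True)
class FunctionSample:
    """One changed function, mirroring the three-part input in the abstract."""
    clean_version: str            # function before the change was made
    changed_version: str          # function after the change (buggy for positive samples)
    fixed_version: Optional[str]  # repaired function; None when no fix exists
    defect_type: str = CLEAN_LABEL


def classification_example(sample: FunctionSample) -> Tuple[Tuple[str, str], str]:
    """Pair the clean and changed versions with the defect-type label."""
    return (sample.clean_version, sample.changed_version), sample.defect_type


def generation_example(sample: FunctionSample) -> Tuple[str, str]:
    """Return (source, target) for sequence generation.

    Assumption: for a defective function the target is the fixed version;
    otherwise the target equals the source, i.e. the function is kept unchanged.
    """
    source = sample.changed_version
    if sample.defect_type == CLEAN_LABEL or sample.fixed_version is None:
        return source, source
    return source, sample.fixed_version
```

For a positive sample, `classification_example` yields the (clean, buggy) pair with its defect type, and `generation_example` yields the buggy function as the source and the fixed function as the target. For a clean sample, the generation target is simply the unchanged function.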