Paper Title
A Replication Study on Predicting Metamorphic Relations at Unit Testing Level
Paper Authors
Abstract
Metamorphic Testing (MT) addresses the test oracle problem by examining the relations between the inputs and outputs of test executions. Such relations are known as Metamorphic Relations (MRs). In current practice, identifying and selecting suitable MRs is usually a challenging manual task that requires a thorough grasp of the system under test (SUT) and its application domain. To address this, Kanewala et al. proposed the Predicting Metamorphic Relations (PMR) approach, which automatically suggests MRs from a list of six pre-defined MRs for testing newly developed methods. PMR is based on a classification model trained on features extracted from the control-flow graphs (CFGs) of 100 Java methods. In our replication study, we explore the generalizability of PMR. First, we rebuild the entire preprocessing and training pipeline and repeat the original study in a close replication to verify the reported results and establish the basis for further experiments. Second, we perform a conceptual replication to explore whether the PMR model trained on CFGs from Java methods in the first step can be reused for functionally identical methods implemented in Python and C++. Finally, we retrain the model on the CFGs from the Python and C++ methods to investigate the dependence on programming language and implementation details. We successfully replicated the original study, achieving comparable results for the set of Java methods. However, the prediction performance of the Java-based classifiers decreases significantly when they are applied to functionally equivalent Python and C++ methods, even though only CFG features are used to abstract from language details. Since the performance improved again when the classifiers were retrained on the CFGs of the methods written in Python and C++, we conclude that the PMR approach can be generalized, but only when the classifiers are developed from code artefacts in the target programming language.
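To make the idea of a metamorphic relation concrete, the following minimal sketch checks one relation on two small functions. Everything in it is an illustrative assumption and not taken from the study: the abstract does not name the six pre-defined MRs, so a simple "reordering the input must not change the output" relation is used here, and the functions `total`, `first_element`, and `holds_reordering_mr` are hypothetical names.

```python
from typing import Callable, List


def total(values: List[int]) -> int:
    """Hypothetical method under test: sums a list of integers."""
    result = 0
    for value in values:
        result += value
    return result


def first_element(values: List[int]) -> int:
    """Hypothetical method under test: returns the first list element."""
    return values[0]


def holds_reordering_mr(func: Callable[[List[int]], int],
                        source_input: List[int]) -> bool:
    """Check an assumed MR: reordering the input leaves the output unchanged.

    The follow-up input is the reversed source input (one specific
    permutation). No expected output value is needed; only the relation
    between the two outputs is examined, which is how MT sidesteps the
    test oracle problem.
    """
    follow_up_input = list(reversed(source_input))
    return func(source_input) == func(follow_up_input)


if __name__ == "__main__":
    print(holds_reordering_mr(total, [3, 1, 2]))          # relation holds
    print(holds_reordering_mr(first_element, [1, 2, 3]))  # relation is violated
```

A relation that holds for one method (`total`) may not hold for another (`first_element`), which is why deciding which MRs apply to a given method is the step that PMR tries to automate with a classifier.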