Paper Title
Muffin: Testing Deep Learning Libraries via Neural Architecture Fuzzing
Paper Authors
Paper Abstract
Deep learning (DL) techniques have proven effective in many challenging tasks and have become widely adopted in practice. However, previous work has shown that DL libraries, the basis for building and executing DL models, contain bugs that can cause severe consequences. Unfortunately, existing testing approaches still cannot comprehensively exercise DL libraries: they rely on existing trained models and detect bugs only in the model inference phase. In this work, we propose Muffin to address these issues. Muffin applies a specifically designed model fuzzing approach, which allows it to generate diverse DL models to explore the target library instead of relying only on existing trained models. Muffin makes differential testing feasible in the model training phase by tailoring a set of metrics to measure the inconsistencies between different DL libraries. In this way, Muffin can exercise the library code more thoroughly and detect more bugs. To evaluate the effectiveness of Muffin, we conduct experiments on three widely used DL libraries. The results demonstrate that Muffin detects 39 new bugs in the latest release versions of popular DL libraries, including TensorFlow, CNTK, and Theano.
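The idea of combining model fuzzing with differential testing can be illustrated with a minimal sketch. It is not Muffin's implementation: the two "backends", the layer names (`relu`, `scale`, `shift`), the maximum-absolute-difference metric, and the threshold are all hypothetical stand-ins, and the sketch compares only forward layer outputs rather than the training-phase metrics mentioned in the abstract. `BACKEND_B` contains a deliberately injected bug so that an inconsistency can be observed.

```python
import random

# Two hypothetical "backends" that implement the same layer types.
# They stand in for real DL libraries; nothing here calls TensorFlow, CNTK, or Theano.
BACKEND_A = {
    "relu": lambda xs: [max(0.0, x) for x in xs],
    "scale": lambda xs: [2.0 * x for x in xs],
    "shift": lambda xs: [x + 1.0 for x in xs],
}
# Backend B reuses A's layers but has an injected bug in "relu" (illustration only).
BACKEND_B = dict(BACKEND_A, relu=lambda xs: [x if x > 0 else 0.01 * x for x in xs])


def generate_model(rng, depth):
    """Randomly pick a chain of layer names (a toy stand-in for model fuzzing)."""
    return [rng.choice(sorted(BACKEND_A)) for _ in range(depth)]


def run_model(backend, model, xs):
    """Run the input through each layer and record every intermediate output."""
    trace = []
    for name in model:
        xs = backend[name](xs)
        trace.append(xs)
    return trace


def first_inconsistency(model, xs, threshold=1e-6):
    """Return (layer index, layer name, max abs diff) for the first layer whose
    outputs differ between the two backends by more than threshold, else None."""
    trace_a = run_model(BACKEND_A, model, xs)
    trace_b = run_model(BACKEND_B, model, xs)
    for i, (out_a, out_b) in enumerate(zip(trace_a, trace_b)):
        diff = max(abs(a - b) for a, b in zip(out_a, out_b))
        if diff > threshold:
            return i, model[i], diff
    return None


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for _ in range(5):
        model = generate_model(rng, depth=4)
        xs = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        print(model, first_inconsistency(model, xs))
```

Comparing the per-layer traces rather than only the final output is a design choice of this sketch: it points at the first layer where the backends disagree, which makes the faulty layer easier to locate.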