Paper title
Magma: A Ground-Truth Fuzzing Benchmark
Paper authors
Abstract
High scalability and low running costs have made fuzz testing the de facto standard for discovering software bugs. Fuzzing techniques are constantly being improved in a race to build the ultimate bug-finding tool. However, while fuzzing excels at finding bugs in the wild, evaluating and comparing fuzzer performance is challenging due to the lack of metrics and benchmarks. For example, crash count, perhaps the most commonly-used performance metric, is inaccurate due to imperfections in deduplication techniques. Additionally, the lack of a unified set of targets results in ad hoc evaluations that hinder fair comparison. We tackle these problems by developing Magma, a ground-truth fuzzing benchmark that enables uniform fuzzer evaluation and comparison. By introducing real bugs into real software, Magma allows for the realistic evaluation of fuzzers against a broad set of targets. By instrumenting these bugs, Magma also enables the collection of bug-centric performance metrics independent of the fuzzer. Magma is an open benchmark consisting of seven targets that perform a variety of input manipulations and complex computations, presenting a challenge to state-of-the-art fuzzers. We evaluate seven widely-used mutation-based fuzzers (AFL, AFLFast, AFL++, FairFuzz, MOpt-AFL, honggfuzz, and SymCC-AFL) against Magma over 200,000 CPU-hours. Based on the number of bugs reached, triggered, and detected, we draw conclusions about the fuzzers' exploration and detection capabilities. This provides insight into fuzzer performance evaluation, highlighting the importance of ground truth in performing more accurate and meaningful evaluations.