Scenario-Based Test Reduction and Prioritization for Multi-Module Autonomous Driving Systems

Paper Title

Scenario-Based Test Reduction and Prioritization for Multi-Module Autonomous Driving Systems

Authors

Deng, Yao; Zheng, Xi; Zhang, Mengshi; Lou, Guannan; Zhang, Tianyi

Abstract

When developing autonomous driving systems (ADS), developers often need to replay previously collected driving recordings to check the correctness of newly introduced changes to the system. However, replaying an entire recording is unnecessary given the high redundancy of driving scenes in a recording (e.g., keeping the same lane for 10 minutes on a highway). In this paper, we propose a novel test reduction and prioritization approach for multi-module ADS. First, our approach automatically encodes the frames of a driving recording as feature vectors based on a driving scene schema. Then, the recording is sliced into segments based on the similarity of consecutive vectors. Lengthy segments are truncated to shorten the recording, and redundant segments with the same vector are removed. The remaining segments are prioritized based on both the coverage and the rarity of driving scenes. We implemented this approach on an industry-level, multi-module ADS called Apollo and evaluated it on three road maps in various regression settings. The results show that our approach reduced the original recordings by over 34% while keeping comparable test effectiveness, identifying almost all injected faults. Furthermore, our test prioritization method achieves about 22% to 39% and 41% to 53% improvements over three baselines in terms of both the average percentage of faults detected (APFD) and TOP-K.
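
The pipeline described in the abstract (slice, truncate, deduplicate, prioritize) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: frames are taken to be already encoded as tuples of categorical features, "similar" consecutive vectors is simplified to "identical", rarity is approximated by how few frames in the original recording share a vector, and coverage is approximated by newly seen (feature index, value) pairs. The function names and the `max_len` parameter are hypothetical.

```python
from collections import Counter
from typing import Hashable, List, Sequence, Tuple

Vector = Tuple[Hashable, ...]
# (scene vector, start frame index, end frame index exclusive)
Segment = Tuple[Vector, int, int]


def slice_recording(frames: Sequence[Vector]) -> List[Segment]:
    """Group consecutive frames into segments.

    Assumption: two consecutive frames are 'similar' only if their
    vectors are identical.
    """
    segments: List[Segment] = []
    start = 0
    for i in range(1, len(frames) + 1):
        if i == len(frames) or frames[i] != frames[start]:
            segments.append((frames[start], start, i))
            start = i
    return segments


def truncate(segments: Sequence[Segment], max_len: int) -> List[Segment]:
    """Cap every segment at max_len frames (max_len is a hypothetical knob)."""
    return [(v, s, min(e, s + max_len)) for v, s, e in segments]


def deduplicate(segments: Sequence[Segment]) -> List[Segment]:
    """Keep only the first segment for each distinct scene vector."""
    seen = set()
    kept: List[Segment] = []
    for seg in segments:
        if seg[0] not in seen:
            seen.add(seg[0])
            kept.append(seg)
    return kept


def prioritize(segments: Sequence[Segment],
               frames: Sequence[Vector]) -> List[Segment]:
    """Greedy ordering: most newly covered feature values first,
    ties broken in favour of rarer scene vectors (assumed scoring)."""
    freq = Counter(frames)
    covered = set()
    remaining = list(segments)
    ordered: List[Segment] = []
    while remaining:
        def score(seg: Segment) -> Tuple[int, int]:
            new_pairs = set(enumerate(seg[0])) - covered
            return (len(new_pairs), -freq[seg[0]])

        best = max(remaining, key=score)
        ordered.append(best)
        covered |= set(enumerate(best[0]))
        remaining.remove(best)
    return ordered
```

A typical call chain under these assumptions would be `prioritize(deduplicate(truncate(slice_recording(frames), max_len=3)), frames)`, which returns the retained segments in the order they would be replayed.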
