Assessing Safety-Critical Systems from Operational Testing: A Study on Autonomous Vehicles

Paper title

Assessing Safety-Critical Systems from Operational Testing: A Study on Autonomous Vehicles

Authors

Zhao, Xingyu; Salako, Kizito; Strigini, Lorenzo; Robu, Valentin; Flynn, David

Abstract

Context: Demonstrating high reliability and safety for safety-critical systems (SCSs) remains a hard problem. Diverse evidence needs to be combined in a rigorous way: in particular, results of operational testing with other evidence from design and verification. Growing use of machine learning in SCSs, by precluding most established methods for gaining assurance, makes operational testing even more important for supporting safety and reliability claims. Objective: We use Autonomous Vehicles (AVs) as a current example to revisit the problem of demonstrating high reliability. AVs are making their debut on public roads: methods for assessing whether an AV is safe enough are urgently needed. We demonstrate how to answer 5 questions that would arise in assessing an AV type, starting with those proposed by a highly-cited study. Method: We apply new theorems extending Conservative Bayesian Inference (CBI), which exploit the rigour of Bayesian methods while reducing the risk of involuntary misuse associated with now-common applications of Bayesian inference; we define additional conditions needed for applying these methods to AVs. Results: Prior knowledge can bring substantial advantages if the AV design allows strong expectations of safety before road testing. We also show how naive attempts at conservative assessment may lead to over-optimism instead; why extrapolating the trend of disengagements is not suitable for safety claims; use of knowledge that an AV has moved to a less stressful environment. Conclusion: While some reliability targets will remain too high to be practically verifiable, CBI removes a major source of doubt: it allows use of prior knowledge without inducing dangerously optimistic biases. For certain ranges of required reliability and prior beliefs, CBI thus supports feasible, sound arguments. Useful conservative claims can be derived from limited prior knowledge.
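
To give a sense of why some reliability targets are "too high to be practically verifiable" from road testing alone, here is a minimal sketch of the classical failure-free testing calculation. It is not the paper's CBI method. It assumes a constant failure rate per mile (an exponential model) and no prior knowledge. The function name and the example target of 1e-8 failures per mile are hypothetical, chosen only for illustration.

```python
import math


def failure_free_miles_needed(target_rate_per_mile: float, confidence: float) -> float:
    """Miles of failure-free driving needed to claim, with the given confidence,
    that the failure rate is below target_rate_per_mile.

    Assumption (not from the paper): failures occur at a constant rate, so
    P(no failure in n miles | rate r) = exp(-r * n). Requiring this probability
    to be at most (1 - confidence) when r equals the target gives
    n >= -ln(1 - confidence) / target.
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    if target_rate_per_mile <= 0.0:
        raise ValueError("target_rate_per_mile must be positive")
    return -math.log(1.0 - confidence) / target_rate_per_mile


if __name__ == "__main__":
    # Hypothetical target: 1 failure per 100 million miles, 95% confidence.
    miles = failure_free_miles_needed(1e-8, 0.95)
    print(f"{miles:.3e} failure-free miles needed")
```

Under these assumptions the required mileage grows in inverse proportion to the target rate, which is the kind of burden that prior knowledge, used conservatively as the abstract describes, is meant to reduce.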
