Paper title
AB/BA analysis: A framework for estimating keyword spotting recall improvement while maintaining audio privacy
Authors
Abstract
Evaluating keyword spotting (KWS) systems, which detect keywords in speech, is challenging under realistic privacy constraints. A KWS system is designed to collect data only when the keyword is present, which limits the availability of hard samples that may contain false negatives and prevents direct estimation of model recall from production data. Alternatively, complementary data collected from other sources may not be fully representative of the real application. In this work, we propose an evaluation technique that we call AB/BA analysis. Our framework evaluates a candidate KWS model B against a baseline model A, using cross-dataset offline decoding to estimate relative recall without requiring negative examples. Moreover, we propose a formulation whose assumptions allow the relative false positive rate between models to be estimated with low variance even when the number of false positives is small. Finally, we propose leveraging machine-generated soft labels, in a technique we call Semi-Supervised AB/BA analysis, which improves analysis time, privacy, and cost. Experiments with both simulated and real data show that AB/BA analysis successfully measures recall improvement together with the trade-off in relative false positive rate.
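The cross-dataset idea can be illustrated with a small sketch. It is not the paper's estimator, only an illustration under stated assumptions. It relies on the identity recall_B / recall_A = P(B accepts | A accepts) / P(A accepts | B accepts), which holds over true keyword utterances because both sides share the joint probability that A and B both accept. The sketch assumes that dataset A holds keyword utterances collected by model A and later decoded offline by model B, and vice versa for dataset B. It also assumes every collected utterance is a confirmed keyword. The function names, the toy acceptance probabilities, and the latent "difficulty" variable are all hypothetical.

```python
import random


def relative_recall(n_a, n_a_also_b, n_b, n_b_also_a):
    """Estimate recall_B / recall_A from cross-decoding counts.

    n_a         -- keyword utterances collected by model A (dataset A)
    n_a_also_b  -- utterances in dataset A that model B also accepts offline
    n_b         -- keyword utterances collected by model B (dataset B)
    n_b_also_a  -- utterances in dataset B that model A also accepts offline
    """
    p_b_given_a = n_a_also_b / n_a  # estimates P(B accepts | A accepts)
    p_a_given_b = n_b_also_a / n_b  # estimates P(A accepts | B accepts)
    return p_b_given_a / p_a_given_b


def simulate(n=100_000, seed=0):
    """Toy simulation: only accepted utterances are ever stored."""
    rng = random.Random(seed)

    # Hypothetical models: acceptance probability depends on a latent
    # difficulty d in [0, 1). True recall is 1/2 for A and 2/3 for B.
    def accepts_a(d):
        return rng.random() < 1.0 - d

    def accepts_b(d):
        return rng.random() < 1.0 - d * d

    # Traffic served by model A: rejected utterances are never seen.
    n_a = n_a_also_b = 0
    for _ in range(n):
        d = rng.random()
        if accepts_a(d):
            n_a += 1
            n_a_also_b += accepts_b(d)  # offline decoding with B

    # Separate traffic served by model B.
    n_b = n_b_also_a = 0
    for _ in range(n):
        d = rng.random()
        if accepts_b(d):
            n_b += 1
            n_b_also_a += accepts_a(d)  # offline decoding with A

    return relative_recall(n_a, n_a_also_b, n_b, n_b_also_a)


if __name__ == "__main__":
    print(f"estimated recall_B / recall_A: {simulate():.3f}")
```

In this toy setup the true ratio is (2/3) / (1/2) = 4/3, and the estimate is computed without ever observing an utterance that the serving model rejected.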