Paper Title

Unsupervised Domain Adaptation for Acoustic Scene Classification Using Band-Wise Statistics Matching

Authors

Alessandro Ilic Mezza, Emanuël A. P. Habets, Meinard Müller, Augusto Sarti

Abstract

The performance of machine learning algorithms is known to be negatively affected by possible mismatches between training (source) and test (target) data distributions. In fact, this problem emerges whenever an acoustic scene classification system which has been trained on data recorded by a given device is applied to samples acquired under different acoustic conditions or captured by mismatched recording devices. To address this issue, we propose an unsupervised domain adaptation method that consists of aligning the first- and second-order sample statistics of each frequency band of target-domain acoustic scenes to the ones of the source-domain training dataset. This model-agnostic approach is devised to adapt audio samples from unseen devices before they are fed to a pre-trained classifier, thus avoiding any further learning phase. Using the DCASE 2018 Task 1-B development dataset, we show that the proposed method outperforms the state-of-the-art unsupervised methods found in the literature in terms of both source- and target-domain classification accuracy.
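
The core idea in the abstract, aligning the per-band mean and standard deviation of target-domain data to those of the source domain, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the feature layout `(n_clips, n_bands, n_frames)`, the pooling of statistics over all clips and frames, the `eps` guard, and the function names `band_stats` and `match_band_stats` are all assumptions made for illustration, and the synthetic data only stands in for real spectrograms.

```python
import numpy as np


def band_stats(specs, eps=1e-8):
    """Return the per-band mean and standard deviation.

    specs: array of shape (n_clips, n_bands, n_frames), e.g. log-magnitude
    spectrograms (assumed layout). Statistics are pooled over clips and frames.
    """
    mean = specs.mean(axis=(0, 2))
    std = specs.std(axis=(0, 2)) + eps  # eps guards against division by zero
    return mean, std


def match_band_stats(target_specs, source_mean, source_std, eps=1e-8):
    """Standardize each target band, then rescale it with the source statistics."""
    target_mean, target_std = band_stats(target_specs, eps)
    z = (target_specs - target_mean[None, :, None]) / target_std[None, :, None]
    return z * source_std[None, :, None] + source_mean[None, :, None]


# Synthetic stand-in data: the "target device" applies a band-dependent
# gain and offset to features drawn from the source distribution.
rng = np.random.default_rng(0)
source = rng.normal(loc=-3.0, scale=2.0, size=(8, 40, 100))
band_gain = np.linspace(0.5, 1.5, 40)[None, :, None]
target = rng.normal(loc=-3.0, scale=2.0, size=(6, 40, 100)) * band_gain + 1.0

source_mean, source_std = band_stats(source)
adapted = match_band_stats(target, source_mean, source_std)

# After adaptation, the target's per-band statistics agree with the source's.
adapted_mean, adapted_std = band_stats(adapted)
print(np.allclose(adapted_mean, source_mean, atol=1e-6))
print(np.allclose(adapted_std, source_std, atol=1e-6))
```

In this sketch the adapted features would be passed unchanged to a classifier already trained on source-domain data, so no retraining step is involved; labels are never used, which is what makes the adaptation unsupervised.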