Global HRTF Interpolation via Learned Affine Transformation of Hyper-conditioned Features

Paper Title

Global HRTF Interpolation via Learned Affine Transformation of Hyper-conditioned Features

Authors

Jin Woo Lee, Sungho Lee, Kyogu Lee

Abstract

Estimating Head-Related Transfer Functions (HRTFs) of arbitrary source points is essential in immersive binaural audio rendering. Computing each individual's HRTFs is challenging, as traditional approaches require expensive time and computational resources, while modern data-driven approaches are data-hungry. Especially for the data-driven approaches, existing HRTF datasets differ in spatial sampling distributions of source positions, posing a major problem when generalizing the method across multiple datasets. To alleviate this, we propose a deep learning method based on a novel conditioning architecture. The proposed method can predict an HRTF of any position by interpolating the HRTFs of known distributions. Experimental results show that the proposed architecture improves the model's generalizability across datasets with various coordinate systems. Additional demonstrations show that the model robustly reconstructs the target HRTFs from the spatially downsampled HRTFs in both quantitative and perceptual measures.
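
The abstract does not spell out the conditioning architecture, so the following is only a minimal sketch of what a feature-wise affine transformation conditioned on source position could look like. Everything in it is an assumption made for illustration: the helper names (`make_linear`, `affine_condition`), the use of a single linear layer to produce the scale and shift, the random weights standing in for learned parameters, and the choice of Cartesian coordinates as the conditioning input. It is not the authors' implementation.

```python
import math
import random


def make_linear(n_in, n_out, rng):
    """Random weights standing in for learned parameters (hypothetical)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def linear(x, params):
    """Plain affine map y = W x + b on Python lists."""
    weights, biases = params
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def spherical_to_cartesian(azimuth_deg, elevation_deg):
    """Map a source direction to a unit vector, so that datasets using
    different angular conventions can share one conditioning input (assumption)."""
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    return [math.cos(el) * math.cos(az), math.cos(el) * math.sin(az), math.sin(el)]


def affine_condition(features, position, scale_net, shift_net):
    """Feature-wise affine transform: gamma(position) * features + beta(position)."""
    gamma = linear(position, scale_net)  # one scale per feature channel
    beta = linear(position, shift_net)   # one shift per feature channel
    return [g * f + b for g, f, b in zip(gamma, features, beta)]


rng = random.Random(0)
n_features = 4
scale_net = make_linear(3, n_features, rng)
shift_net = make_linear(3, n_features, rng)

features = [1.0, -2.0, 0.5, 3.0]                 # toy intermediate features
position = spherical_to_cartesian(30.0, 0.0)     # target source direction
modulated = affine_condition(features, position, scale_net, shift_net)
print(modulated)
```

In this sketch the same feature vector yields different outputs for different target positions, which is the basic mechanism by which a conditioned network could be queried at source positions that were not measured.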
