Paper title
Diffusion-based Deep Active Learning
Authors
Abstract
The remarkable performance of deep neural networks depends on the availability of massive amounts of labeled data. To alleviate the burden of data annotation, deep active learning aims to select a minimal set of training points to be labeled that yields maximal model accuracy. Most existing approaches implement either an 'exploration'-type selection criterion, which aims at exploring the joint distribution of data and labels, or a 'refinement'-type criterion, which aims at localizing the detected decision boundaries. We propose a versatile and efficient criterion that automatically switches from exploration to refinement when the distribution has been sufficiently mapped. Our criterion relies on a process of diffusing the existing label information over a graph constructed from the hidden representation of the data set as provided by the neural network. This graph representation captures the intrinsic geometry of the approximated labeling function. The diffusion-based criterion is shown to be advantageous, as it outperforms existing criteria for deep active learning.
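To make the idea of diffusing label information over a graph more concrete, here is a minimal sketch in Python. It is not the authors' algorithm: the abstract does not specify the graph construction, the diffusion operator, or the exact selection rule. The sketch assumes binary labels encoded as +1/-1 (0 for unlabeled), a symmetric k-nearest-neighbor graph over the hidden features, a row-normalized random-walk operator with labeled nodes clamped after each step, and a rule that queries the unlabeled points with the smallest absolute diffused signal. The function names `knn_graph`, `diffuse_labels`, and `select_queries` are hypothetical.

```python
import numpy as np


def knn_graph(features, k):
    """Symmetric k-nearest-neighbor adjacency matrix with binary weights."""
    n = features.shape[0]
    # Pairwise squared Euclidean distances between hidden representations.
    sq = np.sum(features ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * features @ features.T
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    nbrs = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    W[rows, nbrs.ravel()] = 1.0
    return np.maximum(W, W.T)  # symmetrize


def diffuse_labels(W, labels, n_steps=10):
    """Diffuse labels (+1/-1 labeled, 0 unlabeled) over the graph.

    Assumption: random-walk averaging, with known labels re-imposed
    (clamped) after every step.
    """
    deg = W.sum(axis=1, keepdims=True)
    P = W / np.maximum(deg, 1e-12)  # row-stochastic transition matrix
    known = labels != 0
    signal = labels.astype(float)
    for _ in range(n_steps):
        signal = P @ signal
        signal[known] = labels[known]
    return signal


def select_queries(signal, labels, batch_size):
    """Indices of the unlabeled points with the weakest diffused signal."""
    unlabeled = np.flatnonzero(labels == 0)
    order = np.argsort(np.abs(signal[unlabeled]))
    return unlabeled[order[:batch_size]]
```

Under these assumptions, a small absolute signal arises in two situations. With a limited number of diffusion steps, points many hops away from any labeled point receive little or no label mass, which favors exploration. Points where opposing labels meet receive signals that cancel, which favors refinement near the decision boundary. This is one plausible way a single criterion could cover both regimes; the paper's actual mechanism may differ.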