Paper Title
Category-Learning with Context-Augmented Autoencoder
Paper Authors
Paper Abstract
Finding an interpretable, non-redundant representation of real-world data is one of the key problems in Machine Learning. Biological neural networks are known to solve this problem quite well in an unsupervised manner, yet unsupervised artificial neural networks either struggle to do so or require fine-tuning for each task individually. We associate this with the fact that a biological brain learns in the context of the relationships between observations, while an artificial network does not. We also notice that, although naive data augmentation techniques can be very useful for supervised learning problems, autoencoders typically fail to generalize the transformations underlying data augmentations. Thus, we believe that providing additional knowledge about relationships between data samples will improve a model's ability to find a useful inner representation of the data. More formally, we consider a dataset not as a manifold but as a category, whose objects are the examples. Two objects are connected by a morphism if they represent different transformations of the same entity. Following this formalism, we propose a novel method of using data augmentations when training autoencoders. We train a Variational Autoencoder in such a way that the outcome of a transformation is predictable by an auxiliary network from the hidden representation. We believe that the classification accuracy of a linear classifier on the learned representation is a good metric for measuring its interpretability. In our experiments, the present approach outperforms $\beta$-VAE and is comparable with Gaussian-mixture VAE.
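To make the described training objective concrete, below is a minimal PyTorch-style sketch of one way such a setup could look: a standard VAE loss plus a context term in which an auxiliary network predicts the latent code of the augmented sample from the latent code of the original and a code describing the transformation. The network sizes, the transformation code `t_code`, the detached target, and the loss weights `beta` and `gamma` are illustrative assumptions, not the authors' exact architecture.

```python
# Minimal sketch of a context-augmented VAE objective (illustrative, not the
# authors' exact method). Shapes, modules, and weights are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

LATENT, T_DIM = 32, 8  # assumed latent size and transformation-code size

class Encoder(nn.Module):
    def __init__(self, in_dim=784):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, LATENT)
        self.logvar = nn.Linear(256, LATENT)
    def forward(self, x):
        h = self.body(x)
        return self.mu(h), self.logvar(h)

encoder = Encoder()
decoder = nn.Sequential(nn.Linear(LATENT, 256), nn.ReLU(), nn.Linear(256, 784))
# Auxiliary network: predicts the latent code of the transformed sample
# from the latent code of the original plus the transformation code.
aux = nn.Sequential(nn.Linear(LATENT + T_DIM, 128), nn.ReLU(),
                    nn.Linear(128, LATENT))

def reparameterize(mu, logvar):
    return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

def loss_step(x, x_aug, t_code, beta=1.0, gamma=1.0):
    """x: original batch; x_aug: augmented batch; t_code: transformation code."""
    mu, logvar = encoder(x)
    z = reparameterize(mu, logvar)
    recon = F.mse_loss(decoder(z), x)                # standard VAE terms
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    with torch.no_grad():                            # one possible choice:
        mu_aug, _ = encoder(x_aug)                   # detach the target latent
    pred = aux(torch.cat([z, t_code], dim=1))        # predict it from (z, t)
    context = F.mse_loss(pred, mu_aug)               # context (morphism) term
    return recon + beta * kl + gamma * context
```

The context term penalizes representations in which the effect of a known transformation cannot be predicted, which pushes the encoder toward latent spaces where augmentation transforms act in a regular, predictable way.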