Paper title
Self-Supervised Representation Learning on Document Images
Authors
Abstract
This work analyses the impact of self-supervised pre-training on document images in the context of document image classification. While previous approaches explore the effect of self-supervision on natural images, we show that patch-based pre-training performs poorly on document images because of their different structural properties and poor intra-sample semantic information. We propose two context-aware alternatives to improve performance on the Tobacco-3482 image classification task. We also propose a novel self-supervision method that exploits the inherent multi-modality of documents (image and text). On document image classification with a limited amount of data, it outperforms other popular self-supervised methods as well as supervised ImageNet pre-training.