Self-Supervised Representation Learning on Document Images

Paper title

Self-Supervised Representation Learning on Document Images

Authors

Adrian Cosma, Mihai Ghidoveanu, Michael Panaitescu-Liess, Marius Popescu

Abstract

This work analyses the impact of self-supervised pre-training on document images in the context of document image classification. While previous approaches explore the effect of self-supervision on natural images, we show that patch-based pre-training performs poorly on document images because of their different structural properties and poor intra-sample semantic information. We propose two context-aware alternatives to improve performance on the Tobacco-3482 image classification task. We also propose a novel self-supervision method that exploits the inherent multi-modality of documents (image and text). On document image classification with a limited amount of data, it performs better than other popular self-supervised methods and than supervised ImageNet pre-training.
