Paper Title

Two-stage generative adversarial networks for document image binarization with color noise and background removal

Authors

Sungho Suh, Jihun Kim, Paul Lukowicz, Yong Oh Lee

Abstract

Document image enhancement and binarization methods are often used to improve the accuracy and efficiency of document image analysis tasks such as text recognition. Traditional non-machine-learning methods are constructed on low-level features in an unsupervised manner but have difficulty with binarization on documents with severely degraded backgrounds. Convolutional neural network-based methods focus only on grayscale images and on local textual features. In this paper, we propose a two-stage color document image enhancement and binarization method using generative adversarial neural networks. In the first stage, four color-independent adversarial networks are trained to extract color foreground information from an input image for document image enhancement. In the second stage, two independent adversarial networks with global and local features are trained for image binarization of documents of variable size. For the adversarial neural networks, we formulate loss functions between a discriminator and generators having an encoder-decoder structure. Experimental results show that the proposed method achieves better performance than many classical and state-of-the-art algorithms over the Document Image Binarization Contest (DIBCO) datasets, the LRDE Document Binarization Dataset (LRDE DBD), and our shipping label image dataset. We plan to release the shipping label dataset as well as our implementation code at github.com/opensuh/DocumentBinarization/.
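The two-stage data flow described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the trained generators are replaced by trivial stand-in functions, and the channel split (R, G, B, plus a grayscale average), the patch size, and the logical-AND merge of the local and global outputs are all assumptions that the abstract does not specify. The function names `split_channels`, `stage1_enhance`, and `stage2_binarize` are hypothetical.

```python
import numpy as np

def split_channels(rgb):
    """Assumed split into four single-channel images: R, G, B, and a gray average."""
    gray = rgb.mean(axis=2)
    return [rgb[..., 0], rgb[..., 1], rgb[..., 2], gray]

def stage1_enhance(rgb_uint8, generators):
    """Stage 1: run one generator per channel and stack the results (H x W x 4)."""
    rgb = rgb_uint8.astype(np.float32) / 255.0
    channels = split_channels(rgb)
    enhanced = [gen(ch) for gen, ch in zip(generators, channels)]
    return np.stack(enhanced, axis=-1)

def stage2_binarize(enhanced, local_gen, global_gen, patch=4):
    """Stage 2: a local pass over fixed-size patches plus a global pass over the whole image."""
    h, w, _ = enhanced.shape
    local_out = np.zeros((h, w), dtype=np.float32)
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            local_out[y:y + patch, x:x + patch] = local_gen(enhanced[y:y + patch, x:x + patch])
    global_out = global_gen(enhanced)
    # Placeholder merge (assumption): a pixel is background only if both passes agree.
    return ((local_out > 0.5) & (global_out > 0.5)).astype(np.uint8)

# Stand-ins for trained generators (hypothetical, for illustration only).
identity = lambda channel: channel
channel_mean = lambda x: x.mean(axis=-1)

# Toy 8x8 "document": white background with a dark 4x4 block of "text".
img = np.full((8, 8, 3), 255, dtype=np.uint8)
img[2:6, 2:6] = 0

enhanced = stage1_enhance(img, [identity] * 4)
mask = stage2_binarize(enhanced, channel_mean, channel_mean)
```

In this toy setup `mask` holds 1 for background pixels and 0 for text pixels. In a real system each stand-in would be a trained encoder-decoder generator, and the local pass is what lets documents of variable size be handled patch by patch.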
