Paper Title

Pattern Spotting and Image Retrieval in Historical Documents using Deep Hashing

Authors

Dias, Caio da S., Britto Jr., Alceu de S., Barddal, Jean P., Heutte, Laurent, Koerich, Alessandro L.

Abstract

This paper presents a deep learning approach for image retrieval and pattern spotting in digital collections of historical documents. First, a region proposal algorithm detects object candidates in the document page images. Next, deep learning models are used for feature extraction, considering two distinct variants, which provide either real-valued or binary code representations. Finally, candidate images are ranked by computing the feature similarity with a given input query. A robust experimental protocol evaluates the proposed approach considering each representation scheme (real-valued and binary code) on the DocExplore image database. The experimental results show that the proposed deep models compare favorably to the state-of-the-art image retrieval approaches for images of historical documents, outperforming other deep models by 2.56 percentage points using the same techniques for pattern spotting. Besides, the proposed approach also reduces the search time by up to 200x and the storage cost up to 6,000x when compared to related works based on real-valued representations.
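The ranking step with binary codes can be illustrated with a minimal sketch. This is not the authors' implementation: random vectors stand in for the deep features of region proposals, thresholding at zero stands in for the learned hashing layer, and the function names `binarize` and `hamming_rank` are hypothetical. It shows only how candidates can be ranked by Hamming distance once each one is represented by a compact binary code.

```python
import numpy as np

def binarize(features: np.ndarray) -> np.ndarray:
    """Threshold real-valued features at zero and pack the bits into bytes.

    Assumption: a simple sign threshold replaces the learned hashing layer.
    """
    bits = (features > 0).astype(np.uint8)
    return np.packbits(bits, axis=1)

def hamming_rank(query_code: np.ndarray, db_codes: np.ndarray) -> np.ndarray:
    """Return database indices sorted by Hamming distance to the query."""
    xor = np.bitwise_xor(db_codes, query_code)       # differing bits, byte-wise
    dists = np.unpackbits(xor, axis=1).sum(axis=1)   # count differing bits per row
    return np.argsort(dists, kind="stable")

# Stand-in data: 1000 candidate regions with 256-dimensional features.
rng = np.random.default_rng(0)
candidates = rng.standard_normal((1000, 256)).astype(np.float32)

# A query that is a slightly perturbed copy of candidate 42.
noise = 0.05 * rng.standard_normal(256).astype(np.float32)
query = candidates[42] + noise

codes = binarize(candidates)             # shape (1000, 32), dtype uint8
query_code = binarize(query[None, :])    # shape (1, 32)
ranking = hamming_rank(query_code, codes)

print("top-5 candidates:", ranking[:5])
print("bytes per real-valued vector:", candidates[0].nbytes)
print("bytes per binary code:", codes[0].nbytes)
```

In this toy setting a 256-dimensional `float32` vector occupies 1,024 bytes while its binary code occupies 32 bytes, and distances reduce to XOR and bit counting. The reductions reported in the abstract depend on the authors' own feature dimensions and code lengths, which this sketch does not attempt to reproduce.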
