SuperOCR: A Conversion from Optical Character Recognition to Image Captioning

Paper Title

SuperOCR: A Conversion from Optical Character Recognition to Image Captioning

Authors

Sun, Baohua; Lin, Michael; Sha, Hao; Yang, Lin

Abstract

Optical Character Recognition (OCR) has many real-world applications. Existing methods normally detect where the characters are and then recognize the character at each detected location, so the accuracy of character recognition depends on the performance of character detection. In this paper, we propose a method for recognizing characters without detecting the location of each character. This is done by converting the OCR task into an image captioning task. One advantage of the proposed method is that labeled bounding boxes for the characters are not needed during training. The experimental results show that the proposed method outperforms existing methods on both license plate recognition and water meter character recognition tasks. The proposed method is also deployed on a low-power (300 mW) CNN accelerator chip connected to a Raspberry Pi 3 for on-device applications.
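
To make the "OCR as image captioning" idea concrete, the sketch below shows one way the training targets could be prepared: each image is paired only with its text label, which is turned into a caption-style token sequence, so no per-character bounding boxes are involved. This is an illustrative assumption, not the authors' implementation; the abstract does not describe the tokenization, and the names `build_vocab`, `encode_caption`, `decode_caption`, the special tokens, and the sample labels are all hypothetical.

```python
# Minimal sketch (assumed, not from the paper): treat a text label such as a
# license plate string as a "caption" made of character tokens.
START, END, PAD = "<start>", "<end>", "<pad>"  # hypothetical special tokens


def build_vocab(labels):
    """Map the special tokens and every character seen in the labels to an id."""
    chars = sorted({ch for text in labels for ch in text})
    return {tok: i for i, tok in enumerate([PAD, START, END] + chars)}


def encode_caption(text, vocab, max_len):
    """Encode a label as <start> c1 c2 ... <end>, padded to max_len ids."""
    tokens = [START] + list(text) + [END]
    if len(tokens) > max_len:
        raise ValueError("label does not fit in max_len tokens")
    tokens += [PAD] * (max_len - len(tokens))
    return [vocab[tok] for tok in tokens]


def decode_caption(ids, vocab):
    """Turn a sequence of ids back into text, stopping at the end token."""
    inverse = {i: tok for tok, i in vocab.items()}
    chars = []
    for i in ids:
        tok = inverse[i]
        if tok == END:
            break
        if tok not in (START, PAD):
            chars.append(tok)
    return "".join(chars)


# Made-up example labels; a real dataset would supply (image, label) pairs.
labels = ["AB123", "7C45"]
vocab = build_vocab(labels)
ids = encode_caption("7C45", vocab, max_len=8)
text = decode_caption(ids, vocab)
```

A captioning model trained on such sequences would predict one character token per decoding step from the whole image, which is why the annotation needs only the label string rather than the location of each character.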
