On-Device Language Identification of Text in Images using Diacritic Characters

Paper Title

On-Device Language Identification of Text in Images using Diacritic Characters

Authors

Shubham Vatsal, Nikhil Arora, Gopi Ramena, Sukumar Moharana, Dhruval Jain, Naresh Purre, Rachit S Munjal

Abstract

Diacritic characters can be considered a unique set of characters that provide adequate and significant clues for identifying a given language with considerably high accuracy. Diacritics, though associated with phonetics, often serve as a distinguishing feature for many languages, especially those written in a Latin script. In this work, we aim to identify the language of text in images using the presence of diacritic characters, in order to improve Optical Character Recognition (OCR) performance in any given automated environment. We showcase our work across 13 Latin-script languages encompassing 85 diacritic characters. We use an architecture similar to SqueezeDet for object detection of diacritic characters, followed by a shallow network that identifies the language. OCR systems that are given the identified language as a parameter tend to produce better results than OCR systems deployed alone. Apart from guaranteeing an improvement in OCR results, the discussed work also takes on-device (mobile phone) constraints into consideration in terms of model size and inference time.
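
The core intuition, that the set of diacritics present in a piece of text narrows down its language, can be illustrated with a minimal text-only sketch. This is not the authors' method: the paper detects diacritics in images with a SqueezeDet-like detector and classifies with a shallow network, whereas the sketch below swaps in plain string inspection and a lookup-table vote. The table `DIACRITICS_BY_LANGUAGE` and the function names are hypothetical, and the character sets are small hand-picked subsets, not the 85 characters used in the paper.

```python
import unicodedata
from collections import Counter

# Hypothetical, hand-picked subsets for illustration only;
# these are NOT the 13 languages / 85 characters from the paper.
_RAW_TABLE = {
    "German": "äöü",
    "French": "àâçéèêëîïôùûü",
    "Spanish": "áéíñóúü",
    "Portuguese": "áâãàçéêíóôõú",
}
# Normalize to NFC so every table entry is a single precomposed code point.
DIACRITICS_BY_LANGUAGE = {
    lang: set(unicodedata.normalize("NFC", chars))
    for lang, chars in _RAW_TABLE.items()
}


def extract_diacritics(text):
    """Return the lowercase characters in `text` that carry a combining mark."""
    found = []
    for ch in unicodedata.normalize("NFC", text.lower()):
        # A character counts as diacritic if its canonical decomposition
        # contains at least one combining mark.
        decomposed = unicodedata.normalize("NFD", ch)
        if any(unicodedata.combining(c) for c in decomposed):
            found.append(ch)
    return found


def score_languages(text):
    """Count, per language, how many observed diacritics belong to its set."""
    counts = Counter(extract_diacritics(text))
    return {
        lang: sum(n for ch, n in counts.items() if ch in chars)
        for lang, chars in DIACRITICS_BY_LANGUAGE.items()
    }


def identify_language(text):
    """Return the best-scoring language, or None if no diacritic matched."""
    scores = score_languages(text)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    for sample in ["Mañana será otro día", "Das ist schön", "hello world"]:
        print(sample, "->", identify_language(sample))
```

A raw count like this breaks ties arbitrarily (for example, "ü" alone matches three of the four sets), which is one reason a learned classifier over the detected characters, as the abstract describes, is a more robust choice than a fixed table.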
