Enhancing Indic Handwritten Text Recognition Using Global Semantic Information

Paper Title

Enhancing Indic Handwritten Text Recognition Using Global Semantic Information

Authors

Mondal, Ajoy; Jawahar, C. V.

Abstract

Handwritten text recognition (HTR) is more interesting and challenging than printed text recognition because handwriting style varies unevenly across writers, content, and time. HTR is even more challenging for Indic languages because (i) multiple characters combine to form conjuncts, which increases the number of characters in each language, and (ii) each Indic script has close to 100 unique basic Unicode characters. Recently, many recognition methods based on the encoder-decoder framework have been proposed to handle such problems. They still face many challenges, such as image blur and incomplete characters caused by varying writing styles and ink density. We argue that most encoder-decoder methods rely on local visual features without explicit global semantic information. In this work, we enhance the performance of Indic handwritten text recognizers using global semantic information. We use a semantic module in an encoder-decoder framework to extract global semantic information for recognizing Indic handwritten text. The semantic information is used both in the encoder, for supervision, and in the decoder, for initialization. The semantic information is predicted from the word embedding of a pre-trained language model. Extensive experiments demonstrate that the proposed framework achieves state-of-the-art results on handwritten text in ten Indic languages.
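
The abstract does not give implementation details, so the following is only a minimal sketch of one way the semantic module's data flow could look. It assumes that the encoder emits one feature vector per time step, that the semantic vector is obtained by mean pooling followed by a linear projection, that supervision is a cosine loss against the word's embedding, and that the decoder's initial state is a copy of the semantic vector. All names (`mean_pool`, `semantic_module`, `cosine_loss`, the dimensions, and the random stand-in values) are hypothetical and are not taken from the paper.

```python
import math
import random


def mean_pool(features):
    """Average per-time-step feature vectors into one global vector."""
    dim = len(features[0])
    return [sum(f[i] for f in features) / len(features) for i in range(dim)]


def linear(vec, weights, bias):
    """Compute y = W x + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def cosine_loss(pred, target):
    """Return 1 - cosine similarity (0 when the vectors are aligned)."""
    dot = sum(p * t for p, t in zip(pred, target))
    norm = (math.sqrt(sum(p * p for p in pred))
            * math.sqrt(sum(t * t for t in target)))
    return 1.0 - dot / norm


def semantic_module(encoder_features, weights, bias):
    """Assumed design: pool local features, then project to embedding size."""
    return linear(mean_pool(encoder_features), weights, bias)


# Random stand-ins for real encoder outputs and a real pre-trained embedding.
rng = random.Random(0)
T, FEAT_DIM, EMB_DIM = 5, 4, 3  # hypothetical sizes
encoder_features = [[rng.uniform(-1, 1) for _ in range(FEAT_DIM)]
                    for _ in range(T)]
weights = [[rng.uniform(-1, 1) for _ in range(FEAT_DIM)]
           for _ in range(EMB_DIM)]
bias = [0.0] * EMB_DIM
word_embedding = [rng.uniform(-1, 1) for _ in range(EMB_DIM)]

semantic = semantic_module(encoder_features, weights, bias)

# Encoder side: the semantic vector is supervised by the word embedding.
semantic_loss = cosine_loss(semantic, word_embedding)

# Decoder side: the semantic vector seeds the initial hidden state.
decoder_init_state = list(semantic)
```

In a trained system the semantic loss would be added to the recognition loss so that the encoder learns word-level context in addition to local stroke features; how the two losses are weighted is not stated in the abstract.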
