Paper Title
Arbitrary Shape Text Detection via Segmentation with Probability Maps
Paper Authors
Paper Abstract
Arbitrary shape text detection is challenging because text instances vary widely in size and aspect ratio, appear in arbitrary orientations and shapes, and are often inaccurately annotated. Because pixel-level prediction scales to any shape, segmentation-based methods adapt well to text of various shapes and have recently attracted considerable attention. However, accurate pixel-level annotations of text are difficult to obtain, and existing scene text detection datasets provide only coarse-grained boundary annotations. Consequently, annotated regions invariably contain many misclassified text or background pixels, which degrades the performance of segmentation-based text detection methods. Generally speaking, whether a pixel belongs to text is highly correlated with its distance to the adjacent annotation boundary. Based on this observation, we propose an innovative and robust segmentation-based detection method that uses probability maps to detect text instances accurately. Specifically, we adopt a Sigmoid Alpha Function (SAF) to map the distances between boundaries and their interior pixels to a probability map. However, because coarse-grained text boundary annotations are uncertain, a single probability map cannot cover complex probability distributions well. We therefore use a group of probability maps computed by a series of Sigmoid Alpha Functions to describe the possible probability distributions. In addition, we propose an iterative model that learns to predict and assimilate probability maps, providing enough information to reconstruct text instances. Finally, simple region growth algorithms aggregate the probability maps into complete text instances. Experimental results demonstrate that our method achieves state-of-the-art detection accuracy on several benchmarks.
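The pipeline described in the abstract (boundary distances, a group of sigmoid-shaped probability maps, then region growing) can be illustrated with a small sketch. The abstract does not give the exact form of the Sigmoid Alpha Function, so the mapping `2 * sigmoid(alpha * d / d_max) - 1` used here is an assumption, as are the function names, the `alpha` values, the 0.5 thresholds, and the global normalization by `d_max`. The sketch builds ground-truth-style maps from a toy annotation mask and does not include the iterative prediction model.

```python
from collections import deque

import numpy as np


def distance_to_boundary(mask):
    """Euclidean distance from each text pixel to the nearest non-text pixel.

    Brute force, so only suitable for small illustrative masks.
    """
    # Pad with background so the image border also counts as a boundary.
    padded = np.pad(mask.astype(bool), 1, constant_values=False)
    fg = np.argwhere(padded)
    bg = np.argwhere(~padded)
    dist = np.zeros(padded.shape, dtype=float)
    if len(fg):
        diff = fg[:, None, :] - bg[None, :, :]
        dist[fg[:, 0], fg[:, 1]] = np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)
    return dist[1:-1, 1:-1]


def sigmoid_alpha_map(dist, alpha):
    """Assumed SAF form: 0 on the background, rising toward 1 in the interior."""
    d_max = dist.max()
    if d_max == 0:
        return np.zeros_like(dist)
    x = alpha * dist / d_max
    return 2.0 / (1.0 + np.exp(-x)) - 1.0


def grow_regions(seed_mask, support_mask):
    """Grow each seed component through 4-connected pixels of support_mask.

    Seeds are processed one after another, so instances that touch through
    the support mask end up sharing a single label.
    """
    h, w = seed_mask.shape
    labels = np.zeros((h, w), dtype=int)
    next_label = 0
    for sy, sx in np.argwhere(seed_mask):
        if labels[sy, sx]:
            continue
        next_label += 1
        labels[sy, sx] = next_label
        queue = deque([(sy, sx)])
        while queue:
            y, x = queue.popleft()
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and labels[ny, nx] == 0 and support_mask[ny, nx]):
                    labels[ny, nx] = next_label
                    queue.append((ny, nx))
    return labels


# Toy annotation mask with two rectangular "text instances".
mask = np.zeros((12, 30), dtype=bool)
mask[2:10, 2:12] = True
mask[2:10, 18:28] = True

dist = distance_to_boundary(mask)
alphas = (1.0, 3.0, 6.0)  # illustrative values, not taken from the paper
maps = np.stack([sigmoid_alpha_map(dist, a) for a in alphas])

# Seeds come from a stricter map, support from a more permissive one.
seeds = maps[1] > 0.5
support = maps[2] > 0.5
labels = grow_regions(seeds, support)
print(labels.max(), "instances recovered")
```

Each `alpha` yields a different falloff from the interior to the boundary, which is one way to picture how a group of maps can express uncertainty about where the true text edge lies inside a coarse annotation. A real pipeline would likely normalize distances per instance and use an optimized distance transform instead of the brute-force version shown here.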