Paper Title
G-Rep: Gaussian Representation for Arbitrary-Oriented Object Detection
Paper Authors
Abstract
Typical representations for arbitrary-oriented object detection tasks include the oriented bounding box (OBB), the quadrilateral bounding box (QBB), and the point set (PointSet). Each representation encounters problems that correspond to its characteristics, such as boundary discontinuity, the square-like problem, representation ambiguity, and isolated points, all of which lead to inaccurate detection. Although many effective strategies have been proposed for the individual representations, there is still no unified solution. Current detection methods based on Gaussian modeling have demonstrated the possibility of breaking this dilemma; however, they remain limited to OBB. To go further, in this paper, we propose a unified Gaussian representation called G-Rep to construct Gaussian distributions for OBB, QBB, and PointSet, which achieves a unified solution to various representations and problems. Specifically, PointSet or QBB-based object representations are converted into Gaussian distributions, and their parameters are optimized using the maximum likelihood estimation algorithm. Then, three optional Gaussian metrics are explored to optimize the regression loss of the detector because of their excellent parameter optimization mechanisms. Furthermore, we also use Gaussian metrics for sampling to align label assignment and regression loss. Experimental results on several publicly available datasets, such as DOTA, HRSC2016, UCAS-AOD, and ICDAR2015, show the excellent performance of the proposed method for arbitrary-oriented object detection.
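The conversion step described in the abstract can be illustrated with a minimal sketch, assuming 2-D points and NumPy. It fits a Gaussian to a point set (or to the four corners of a QBB) with the closed-form maximum likelihood estimate, then compares two Gaussians with the KL divergence. The function names `pointset_to_gaussian` and `kl_divergence` are hypothetical, and the KL divergence is used here only as one familiar example of a Gaussian metric; the abstract does not name the three metrics it explores, and this is not the authors' implementation.

```python
import numpy as np


def pointset_to_gaussian(points):
    """Fit a 2-D Gaussian to a point set by maximum likelihood.

    Hypothetical helper: `points` is an (N, 2) array-like, e.g. the four
    corners of a QBB or the points of a PointSet representation.
    """
    pts = np.asarray(points, dtype=float)
    mu = pts.mean(axis=0)                        # MLE of the mean
    centered = pts - mu
    sigma = centered.T @ centered / len(pts)     # MLE of the covariance (divide by N)
    return mu, sigma


def kl_divergence(mu_p, sigma_p, mu_q, sigma_q):
    """KL(P || Q) between two multivariate Gaussians (closed form)."""
    d = mu_p.shape[0]
    inv_q = np.linalg.inv(sigma_q)
    diff = mu_q - mu_p
    trace_term = np.trace(inv_q @ sigma_p)
    mahalanobis = diff @ inv_q @ diff
    log_det_ratio = np.log(np.linalg.det(sigma_q) / np.linalg.det(sigma_p))
    return 0.5 * (trace_term + mahalanobis - d + log_det_ratio)


# Illustrative data only: a 4 x 2 axis-aligned box and a copy shifted by one unit in x.
box = np.array([[2.0, 1.0], [2.0, -1.0], [-2.0, 1.0], [-2.0, -1.0]])
mu_a, sigma_a = pointset_to_gaussian(box)
mu_b, sigma_b = pointset_to_gaussian(box + np.array([1.0, 0.0]))
distance = kl_divergence(mu_a, sigma_a, mu_b, sigma_b)
```

Because the covariance is computed from the points themselves, the same routine applies to any point-based representation without an explicit angle parameter. Note that a degenerate point set (for example, collinear points) yields a singular covariance, which this sketch does not guard against.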