Paper Title

Object Detection for Graphical User Interface: Old Fashioned or Deep Learning or a Combination?

Authors

Chen, Jieshan; Xie, Mulong; Xing, Zhenchang; Chen, Chunyang; Xu, Xiwei; Zhu, Liming; Li, Guoqiang

Abstract

Detecting Graphical User Interface (GUI) elements in GUI images is a domain-specific object detection task. It supports many software engineering tasks, such as GUI animation and testing, GUI search, and code generation. Existing studies of GUI element detection directly borrow mature methods from the computer vision (CV) domain, including old-fashioned ones that rely on traditional image processing features (e.g., Canny edges, contours) and deep learning models that learn to detect from large-scale GUI data. Unfortunately, these CV methods were not originally designed with awareness of the unique characteristics of GUIs and GUI elements, or of the high localization accuracy that the GUI element detection task demands. We conduct the first large-scale empirical study of seven representative GUI element detection methods on over 50k GUI images to understand the capabilities, limitations, and effective designs of these methods. This study not only sheds light on the technical challenges to be addressed but also informs the design of new GUI element detection methods. We accordingly design a new GUI-specific old-fashioned method for non-text GUI element detection, which adopts a novel top-down coarse-to-fine strategy, and combine it with a mature deep learning model for GUI text detection. Our evaluation on 25,000 GUI images shows that our method significantly advances the state-of-the-art performance in GUI element detection.
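
To make the two ideas in the abstract more concrete, the following is a minimal sketch, not the paper's method. It assumes a GUI screenshot has already been binarized into a small grid of 0s and 1s, uses a flood fill over connected regions as a stand-in for the contour-style "old-fashioned" region detection mentioned above, and uses intersection over union (IoU) as one common way to quantify localization accuracy. The names `find_boxes`, `iou`, and `SCREEN` are hypothetical and chosen only for illustration.

```python
from collections import deque


def find_boxes(grid):
    """Return inclusive (left, top, right, bottom) boxes of 4-connected foreground regions."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 0 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            left, top, right, bottom = c, r, c, r
            while queue:
                y, x = queue.popleft()
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((left, top, right, bottom))
    return boxes


def iou(a, b):
    """Intersection over union of two inclusive pixel boxes."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0]) + 1
    inter_h = min(a[3], b[3]) - max(a[1], b[1]) + 1
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / (area_a + area_b - inter)


# Hypothetical binarized screenshot containing two rectangular "elements".
SCREEN = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

detected = find_boxes(SCREEN)
```

Shifting a 3x2 box by a single pixel already halves its IoU in this toy setting, which hints at why box-level precision matters more for small, tightly packed GUI elements than for typical natural-scene objects.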
