Paper Title

Extracting Replayable Interactions from Videos of Mobile App Usage

Paper Authors

Jieshan Chen, Amanda Swearngin, Jason Wu, Titus Barik, Jeffrey Nichols, Xiaoyi Zhang

Paper Abstract

Screen recordings of mobile apps are a popular and readily available way for users to share how they interact with apps, such as in online tutorial videos, user reviews, or as attachments in bug reports. Unfortunately, both people and systems can find it difficult to reproduce touch-driven interactions from video pixel data alone. In this paper, we introduce an approach to extract and replay user interactions in videos of mobile apps, using only pixel information in video frames. To identify interactions, we apply heuristic-based image processing and convolutional deep learning to segment screen recordings, classify the interaction in each segment, and locate the interaction point. To replay interactions on another device, we match elements on app screens using UI element detection. We evaluate the feasibility of our pixel-based approach using two datasets: the Rico mobile app dataset and a new dataset of 64 apps with both iOS and Android versions. We find that our end-to-end approach can successfully replay a majority of interactions (iOS--84.1%, Android--78.4%) on different devices, which is a step towards supporting a variety of scenarios, including automatically annotating interactions in existing videos, automated UI testing, and creating interactive app tutorials.
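The abstract describes segmenting a screen recording with heuristic-based image processing before classifying each segment's interaction. The paper's exact heuristic is not given here; below is a minimal frame-differencing sketch of the idea, with all names (`segment_recording`, `threshold`) hypothetical — consecutive frames whose pixel change exceeds a threshold mark a candidate segment boundary:

```python
import numpy as np

def segment_recording(frames, threshold=0.01):
    """Split a list of grayscale frames into segments at large pixel changes.

    Returns a list of (start, end) index pairs, end-exclusive.
    This is an illustrative heuristic, not the paper's actual method.
    """
    boundaries = [0]
    for i in range(1, len(frames)):
        # Normalized mean absolute pixel difference between adjacent frames.
        diff = np.mean(np.abs(frames[i].astype(float) -
                              frames[i - 1].astype(float))) / 255.0
        if diff > threshold:
            boundaries.append(i)
    boundaries.append(len(frames))
    return [(boundaries[j], boundaries[j + 1])
            for j in range(len(boundaries) - 1)
            if boundaries[j + 1] > boundaries[j]]

# Synthetic "recording": 3 black frames, then 3 white frames.
frames = [np.zeros((4, 4), dtype=np.uint8)] * 3 + \
         [np.full((4, 4), 255, dtype=np.uint8)] * 3
print(segment_recording(frames))  # → [(0, 3), (3, 6)]
```

In the full pipeline the abstract outlines, each such segment would then be fed to a classifier to label the interaction type and locate the interaction point.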
