Enabling Conversational Interaction with Mobile UI using Large Language Models

Paper Title

Enabling Conversational Interaction with Mobile UI using Large Language Models

Authors

Bryan Wang, Gang Li, Yang Li

Abstract

Conversational agents show the promise to allow users to interact with mobile devices using language. However, to perform diverse UI tasks with natural language, developers typically need to create separate datasets and models for each specific task, which is expensive and effort-consuming. Recently, pre-trained large language models (LLMs) have been shown capable of generalizing to various downstream tasks when prompted with a handful of examples from the target task. This paper investigates the feasibility of enabling versatile conversational interactions with mobile UIs using a single LLM. We designed prompting techniques to adapt an LLM to mobile UIs. We experimented with four important modeling tasks that address various scenarios in conversational interaction. Our method achieved competitive performance on these challenging tasks without requiring dedicated datasets and training, offering a lightweight and generalizable approach to enable language-based mobile interaction.

Paper Title