Seq-2-Seq based Refinement of ASR Output for Spoken Name Capture

Paper Title

Seq-2-Seq based Refinement of ASR Output for Spoken Name Capture

Authors

Karan Singla, Shahab Jalalvand, Yeon-Jun Kim, Ryan Price, Daniel Pressel, Srinivas Bangalore

Abstract

Capturing person names from human speech is a difficult task in human-machine conversations. In this paper, we propose a novel approach to capturing person names from caller utterances in response to the prompt "say and spell your first/last name". Inspired by work on spelling correction, disfluency removal, and text normalization, we propose a lightweight Seq-2-Seq system that generates a name spelling from varying user input. Our proposed method outperforms a strong baseline based on an LM-driven rule-based approach.
