PACIFIC: Towards Proactive Conversational Question Answering over Tabular and Textual Data in Finance

Paper Title

PACIFIC: Towards Proactive Conversational Question Answering over Tabular and Textual Data in Finance

Authors

Yang Deng, Wenqiang Lei, Wenxuan Zhang, Wai Lam, Tat-Seng Chua

Abstract

To facilitate conversational question answering (CQA) over hybrid contexts in finance, we present a new dataset, named PACIFIC. Compared with existing CQA datasets, PACIFIC exhibits three key features: (i) proactivity, (ii) numerical reasoning, and (iii) hybrid context of tables and text. A new task is defined accordingly to study Proactive Conversational Question Answering (PCQA), which combines clarification question generation and CQA. In addition, we propose a novel method, namely UniPCQA, to adapt a hybrid format of input and output content in PCQA into the Seq2Seq problem, including the reformulation of the numerical reasoning process as code generation. UniPCQA performs multi-task learning over all sub-tasks in PCQA and incorporates a simple ensemble strategy to alleviate the error propagation issue in the multi-task learning by cross-validating top-$k$ sampled Seq2Seq outputs. We benchmark the PACIFIC dataset with extensive baselines and provide comprehensive evaluations on each sub-task of PCQA.
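
The abstract names two ideas that a small example can make concrete: treating numerical reasoning as code generation, and cross-validating top-$k$ sampled outputs. The sketch below is an illustration under stated assumptions, not the authors' UniPCQA implementation. It assumes each sampled output is a plain arithmetic expression string, executes every candidate with a restricted evaluator, and keeps the result that most candidates agree on. The names `safe_eval` and `vote`, the rounding precision, and the majority-vote rule are all hypothetical choices made for this example.

```python
import ast
import operator
from collections import Counter

# Binary operators the toy evaluator accepts (an assumption of this sketch).
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def safe_eval(expr):
    """Evaluate a plain arithmetic expression; raise ValueError otherwise."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("unsupported expression")

    try:
        return walk(ast.parse(expr, mode="eval"))
    except (SyntaxError, ZeroDivisionError) as exc:
        raise ValueError(str(exc)) from exc


def vote(candidates, ndigits=4):
    """Return the most common executed result among candidates, or None."""
    results = []
    for expr in candidates:
        try:
            results.append(round(safe_eval(expr), ndigits))
        except ValueError:
            continue  # skip candidates that fail to parse or execute
    if not results:
        return None
    return Counter(results).most_common(1)[0][0]


# Hypothetical sampled outputs for "percentage change from 100 to 120".
samples = ["(120 - 100) / 100", "120 / 100 - 1", "120 - 100 / 100", "(120 - 100 /"]
print(vote(samples))
```

In this toy run, two of the four candidates execute to the same value, one executes to a different value, and one fails to parse, so agreement among executed results decides the answer. Voting on executed values rather than on the raw strings lets differently written but equivalent expressions support each other.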
