Paper Title

PACIFIC: Towards Proactive Conversational Question Answering over Tabular and Textual Data in Finance

Authors

Yang Deng, Wenqiang Lei, Wenxuan Zhang, Wai Lam, Tat-Seng Chua

Abstract

To facilitate conversational question answering (CQA) over hybrid contexts in finance, we present a new dataset, named PACIFIC. Compared with existing CQA datasets, PACIFIC exhibits three key features: (i) proactivity, (ii) numerical reasoning, and (iii) hybrid context of tables and text. A new task is defined accordingly to study Proactive Conversational Question Answering (PCQA), which combines clarification question generation and CQA. In addition, we propose a novel method, namely UniPCQA, to adapt a hybrid format of input and output content in PCQA into the Seq2Seq problem, including the reformulation of the numerical reasoning process as code generation. UniPCQA performs multi-task learning over all sub-tasks in PCQA and incorporates a simple ensemble strategy to alleviate the error propagation issue in the multi-task learning by cross-validating top-$k$ sampled Seq2Seq outputs. We benchmark the PACIFIC dataset with extensive baselines and provide comprehensive evaluations on each sub-task of PCQA.
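The abstract does not spell out how the top-$k$ cross-validation works, so the following is only a rough sketch of one plausible reading: each sampled Seq2Seq output is treated as a small arithmetic program, every candidate is executed, and the most frequent result wins. The function names `evaluate` and `vote`, the restriction to the four basic arithmetic operators, and the sample strings are all assumptions made for illustration, not details taken from the paper.

```python
import ast
import operator
from collections import Counter

# Arithmetic operators accepted by this toy evaluator (an assumption).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(expr):
    """Evaluate a pure arithmetic expression; return None if it is invalid."""
    def walk(node):
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("unsupported expression")

    try:
        return round(walk(ast.parse(expr, mode="eval").body), 4)
    except (SyntaxError, ValueError, ZeroDivisionError):
        return None


def vote(candidates):
    """Return the most frequent executed result among the sampled candidates."""
    results = [r for r in map(evaluate, candidates) if r is not None]
    if not results:
        return None
    return Counter(results).most_common(1)[0][0]


# Hypothetical sampled outputs for a percentage-change question.
samples = [
    "(120 - 100) / 100",
    "(120 - 100) / 100",
    "120 - 100 / 100",   # wrong operator precedence
    "(120 - 100 /",      # malformed, so it is discarded
]
print(vote(samples))
```

Candidates that fail to parse or execute drop out before the vote, so a single malformed sample cannot decide the final answer. Parsing with `ast` rather than calling `eval` keeps the sketch from running arbitrary generated code.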
