Paper Title
PACIFIC: Towards Proactive Conversational Question Answering over Tabular and Textual Data in Finance
Authors
Abstract
To facilitate conversational question answering (CQA) over hybrid contexts in finance, we present a new dataset, named PACIFIC. Compared with existing CQA datasets, PACIFIC exhibits three key features: (i) proactivity, (ii) numerical reasoning, and (iii) hybrid context of tables and text. A new task is defined accordingly to study Proactive Conversational Question Answering (PCQA), which combines clarification question generation and CQA. In addition, we propose a novel method, namely UniPCQA, to adapt a hybrid format of input and output content in PCQA into the Seq2Seq problem, including the reformulation of the numerical reasoning process as code generation. UniPCQA performs multi-task learning over all sub-tasks in PCQA and incorporates a simple ensemble strategy to alleviate the error propagation issue in the multi-task learning by cross-validating top-$k$ sampled Seq2Seq outputs. We benchmark the PACIFIC dataset with extensive baselines and provide comprehensive evaluations on each sub-task of PCQA.