Paper Title
Generalizing Math Word Problem Solvers via Solution Diversification
Paper Authors
Abstract
Current math word problem (MWP) solvers are usually Seq2Seq models trained on (one-problem; one-solution) pairs, each consisting of a problem description and a single solution that shows the reasoning flow leading to the correct answer. However, an MWP naturally admits multiple solution equations. Training an MWP solver on (one-problem; one-solution) pairs excludes the other correct solutions and thus limits the solver's generalizability. One feasible remedy is to augment each problem with multiple solutions. However, it is difficult to collect diverse and accurate augmented solutions through human effort. In this paper, we design a new training framework for MWP solvers by introducing a solution buffer and a solution discriminator. The buffer stores solutions generated by the MWP solver to encourage diversity in the training data. The discriminator controls the quality of the buffered solutions that participate in training. Our framework is flexibly applicable to fully, semi-weakly, and weakly supervised training of all Seq2Seq MWP solvers. We conduct extensive experiments on the benchmark dataset Math23k and a new dataset named Weak12k, and show that our framework improves the performance of various MWP solvers under different settings by generating correct and diverse solutions.
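To make the buffer idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a much simpler quality filter than the solution discriminator described in the abstract: a candidate equation is kept only if it evaluates to the known answer. The names `evaluate`, `SolutionBuffer`, `accepts`, and `training_pairs`, and the toy problem with answer 14, are all hypothetical and introduced only for illustration.

```python
import ast
import operator
from collections import defaultdict

# Supported binary operators (an assumption: plain arithmetic only).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(equation: str) -> float:
    """Evaluate an arithmetic expression without calling eval()."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError(f"unsupported expression: {equation!r}")

    return _eval(ast.parse(equation, mode="eval"))


class SolutionBuffer:
    """Hypothetical buffer of solver-generated equations, keyed by problem."""

    def __init__(self, tol: float = 1e-6):
        self.tol = tol
        self._solutions = defaultdict(list)

    def accepts(self, equation: str, answer: float) -> bool:
        # Stand-in for the discriminator: answer agreement only.
        try:
            return abs(evaluate(equation) - answer) <= self.tol
        except (ValueError, SyntaxError, ZeroDivisionError):
            return False

    def add(self, problem_id: str, equation: str, answer: float) -> bool:
        """Store the equation if it is new and passes the filter."""
        if equation in self._solutions[problem_id]:
            return False
        if not self.accepts(equation, answer):
            return False
        self._solutions[problem_id].append(equation)
        return True

    def training_pairs(self):
        """Flatten the buffer into (problem_id, equation) pairs."""
        return [(pid, eq) for pid, eqs in self._solutions.items() for eq in eqs]


# Toy problem with known answer 14 and four solver candidates.
buffer = SolutionBuffer()
for candidate in ["3*4+2", "2+4*3", "3*(4+2)", "3*4+"]:
    buffer.add("p1", candidate, 14)
print(buffer.training_pairs())
```

In this sketch the two equivalent equations are kept as separate training targets for the same problem, while the wrong and malformed candidates are dropped. Answer agreement alone can admit equations that reach the right number by spurious reasoning, which is one reason a learned discriminator may be preferable to this simple check.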