Paper Title
Less is More: Data-Efficient Complex Question Answering over Knowledge Bases
Paper Authors
Paper Abstract
Question answering is an effective method for obtaining information from knowledge bases (KBs). In this paper, we propose the Neural-Symbolic Complex Question Answering (NS-CQA) model, a data-efficient reinforcement learning framework that answers complex questions using only a modest number of training samples. Our framework consists of a neural generator and a symbolic executor: the generator transforms a natural-language question into a sequence of primitive actions, and the executor runs them over the knowledge base to compute the answer. We carefully formulate a set of primitive symbolic actions that allows us not only to simplify our neural network design but also to accelerate model convergence. To reduce the search space, we employ copy and masking mechanisms in our encoder-decoder architecture, drastically shrinking the decoder output vocabulary and improving model generalizability. We equip our model with a memory buffer that stores promising high-reward programs. In addition, we propose an adaptive reward function. By comparing each generated trial with the trials stored in the memory buffer, we derive curriculum-guided reward bonuses, namely proximity and novelty. To mitigate the sparse reward problem, we combine the adaptive reward with these bonuses, reshaping the sparse reward into dense feedback. We also encourage the model to generate new trials, to avoid imitating spurious ones, while having it remember past high-reward trials to improve data efficiency. Our NS-CQA model is evaluated on two datasets: CQA, a recent large-scale complex question answering dataset, and WebQuestionsSP, a multi-hop question answering dataset. On both datasets, our model outperforms the state-of-the-art models. Notably, on CQA, NS-CQA performs well on questions with higher complexity while using only approximately 1% of the total training samples.
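The reward-shaping idea in the abstract — combining a sparse answer-matching reward with curriculum-guided proximity and novelty bonuses derived from a memory buffer of high-reward programs — can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the function names, the use of `difflib.SequenceMatcher` as the sequence-similarity measure, and the bonus weights are not taken from the paper's actual formulation.

```python
# Hypothetical sketch of sparse-reward reshaping with a memory buffer.
# The similarity measure and weighting scheme are illustrative assumptions,
# not the paper's actual definitions of proximity and novelty.
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity in [0, 1] between two action sequences (token lists)."""
    return SequenceMatcher(None, a, b).ratio()


def shaped_reward(trial, base_reward, memory_buffer, w_prox=0.5, w_nov=0.5):
    """Reshape a sparse base reward into denser feedback.

    trial:         generated action sequence (list of action tokens)
    base_reward:   sparse answer-matching reward for this trial
    memory_buffer: list of (program, reward) pairs found so far
    """
    if not memory_buffer:
        return base_reward
    # Proximity bonus: closeness to the best program remembered so far.
    best_program, _ = max(memory_buffer, key=lambda pr: pr[1])
    proximity = similarity(trial, best_program)
    # Novelty bonus: dissimilarity from the buffer on average, which
    # discourages merely imitating (possibly spurious) stored trials.
    mean_sim = sum(similarity(trial, p) for p, _ in memory_buffer) / len(memory_buffer)
    novelty = 1.0 - mean_sim
    return base_reward + w_prox * proximity + w_nov * novelty


# Example: a candidate program scored against two remembered programs.
trial = ["select", "filter", "count"]
buffer = [(["select", "count"], 0.8), (["select", "filter"], 0.3)]
dense = shaped_reward(trial, 0.0, buffer)
```

Even when the base reward is zero (the answer was wrong), the shaped reward is nonzero, which is the point of reshaping sparse terminal feedback into a dense training signal.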