Paper Title

Dependency Parsing with Backtracking using Deep Reinforcement Learning

Paper Authors

Franck Dary, Maxime Petit, Alexis Nasr

Paper Abstract

Greedy algorithms for NLP such as transition-based parsing are prone to error propagation. One way to overcome this problem is to allow the algorithm to backtrack and explore an alternative solution in cases where new evidence contradicts the solution explored so far. In order to implement such a behavior, we use reinforcement learning and let the algorithm backtrack in cases where such an action gets a better reward than continuing to explore the current solution. We test this idea on both POS tagging and dependency parsing and show that backtracking is an effective means to fight against error propagation.
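
The abstract describes the mechanism only at a high level. Below is a minimal sketch, in Python, of the control flow it implies: a greedy transition-based decoder augmented with a BACK action, where an action-value function decides between continuing and undoing the last transition. Everything here (the toy arc-standard transition set, the `q` stand-in for a trained value network, the `max_backtracks` budget) is an illustrative assumption, not the authors' implementation.

```python
"""Minimal sketch (not the authors' code) of greedy transition-based
parsing with a learned BACK action. The arc-standard transition set,
the q() stand-in, and the backtrack budget are illustrative assumptions."""

import copy
import random
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class State:
    buffer: List[str]                                   # words left to read
    stack: List[str] = field(default_factory=list)      # partial analysis
    arcs: List[Tuple[str, str]] = field(default_factory=list)  # (head, dependent)


def legal_actions(state: State) -> List[str]:
    """Forward (non-BACK) transitions available in this state."""
    acts = []
    if state.buffer:
        acts.append("SHIFT")
    if len(state.stack) >= 2:
        acts += ["LEFT-ARC", "RIGHT-ARC"]
    return acts


def apply_action(state: State, action: str) -> State:
    """Return the successor state (the input state is left untouched)."""
    s = copy.deepcopy(state)
    if action == "SHIFT":
        s.stack.append(s.buffer.pop(0))
    elif action == "LEFT-ARC":      # top of stack heads the word below it
        dep = s.stack.pop(-2)
        s.arcs.append((s.stack[-1], dep))
    elif action == "RIGHT-ARC":     # word below top of stack heads the top
        dep = s.stack.pop()
        s.arcs.append((s.stack[-1], dep))
    return s


def decode(words: List[str],
           q: Callable[[State, str], float],
           max_backtracks: int = 5) -> State:
    """Greedy decoding; backtrack when BACK outscores every forward action."""
    state = State(buffer=list(words))
    history: List[State] = []       # snapshots we can return to
    backtracks = 0
    while state.buffer or len(state.stack) > 1:
        forward = {a: q(state, a) for a in legal_actions(state)}
        if not forward:
            break
        best, best_score = max(forward.items(), key=lambda kv: kv[1])
        # BACK competes with the forward transitions on estimated reward;
        # the budget keeps this toy decoder from looping forever.
        if history and backtracks < max_backtracks and q(state, "BACK") > best_score:
            state = history.pop()   # undo the most recent transition
            backtracks += 1
            continue
        history.append(state)
        state = apply_action(state, best)
    return state


if __name__ == "__main__":
    rng = random.Random(0)

    def toy_q(state: State, action: str) -> float:
        return rng.random()         # placeholder for a trained value network

    print(decode("the cat sat".split(), toy_q).arcs)
```

The key point is the comparison inside the loop: BACK competes with the forward transitions on estimated reward, which is how the reinforcement-learning formulation lets the decoder decide when new evidence warrants revisiting an earlier decision rather than propagating an error forward.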
