Improve Event Extraction via Self-Training with Gradient Guidance

Paper Title

Improve Event Extraction via Self-Training with Gradient Guidance

Authors

Xu, Zhiyang; Lee, Jay-Yoon; Huang, Lifu

Abstract

Data scarcity has been the main factor that hinders the progress of event extraction. To overcome this issue, we propose a Self-Training with Feedback (STF) framework that leverages the large-scale unlabeled data and acquires feedback for each new event prediction from the unlabeled data by comparing it to the Abstract Meaning Representation (AMR) graph of the same sentence. Specifically, STF consists of (1) a base event extraction model trained on existing event annotations and then applied to large-scale unlabeled corpora to predict new event mentions as pseudo training samples, and (2) a novel scoring model that takes in each new predicted event trigger, an argument, its argument role, as well as their paths in the AMR graph to estimate a compatibility score indicating the correctness of the pseudo label. The compatibility scores further act as feedback to encourage or discourage the model learning on the pseudo labels during self-training. Experimental results on three benchmark datasets, including ACE05-E, ACE05-E+, and ERE, demonstrate the effectiveness of the STF framework on event extraction, especially event argument extraction, with significant performance gain over the base event extraction models and strong baselines. Our experimental analysis further shows that STF is a generic framework as it can be applied to improve most, if not all, event extraction models by leveraging large-scale unlabeled data, even when high-quality AMR graph annotations are not available.
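To make the feedback idea concrete, here is a minimal sketch of one way a compatibility score could encourage or discourage learning on a pseudo label. It is an illustration under stated assumptions, not the loss used in the paper: the abstract does not give the score range or the training objective, so the `[-1, 1]` range, the function name `feedback_weighted_loss`, and the example role names are all hypothetical.

```python
import math


def feedback_weighted_loss(probs, pseudo_label, score):
    """Score-weighted negative log-likelihood for one pseudo-labeled example.

    probs: dict mapping each candidate argument role to the model's probability.
    pseudo_label: the role predicted by the base model on unlabeled text.
    score: compatibility score, ASSUMED here to lie in [-1, 1].
    """
    p = max(probs[pseudo_label], 1e-12)  # guard against log(0)
    # Positive score: minimizing the loss raises p (learning is encouraged).
    # Negative score: minimizing the loss lowers p (learning is discouraged).
    return -score * math.log(p)


# Hypothetical prediction for one (trigger, argument) pair.
probs = {"Attacker": 0.7, "Target": 0.2, "None": 0.1}
encourage = feedback_weighted_loss(probs, "Attacker", 0.9)
discourage = feedback_weighted_loss(probs, "Attacker", -0.9)
```

Because the score multiplies the log-likelihood, it also scales and signs the gradient with respect to the model's prediction, which is one plausible reading of "gradient guidance" in the title. A score of zero would leave the pseudo label with no influence on training.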
