Paper title
UCD-CS at W-NUT 2020 Shared Task-3: A Text to Text Approach for COVID-19 Event Extraction on Social Media
Paper authors
Abstract
In this paper, we describe our approach to the W-NUT 2020 shared task on COVID-19 event extraction from Twitter. The objective of the task is to extract, from COVID-related tweets, answers to a set of predefined slot-filling questions. Our approach treats event extraction as a question answering task by leveraging the transformer-based T5 text-to-text model. According to the official evaluation metric, F1, our submitted run achieves competitive performance compared to other participating runs (Top 3). However, we argue that this evaluation may underestimate the actual performance of runs based on text generation: although such runs may answer the slot questions well, their answers may not be exact string matches for the gold-standard answers. To measure the extent of this underestimation, we adopt a simple exact-answer transformation method that aims to convert well-answered predictions into exactly matched predictions. The results show that, after this transformation, our run overall reaches the same level of performance as the best participating run and achieves state-of-the-art F1 scores in three of five COVID-related events. Our code is publicly available to aid reproducibility.
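The abstract does not spell out how the exact-answer transformation works, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that each slot has a list of candidate answer strings available, and it snaps a generated answer to the most similar candidate by token-overlap F1. The names `normalize`, `token_f1`, `snap_to_candidate`, and the `threshold` value are all hypothetical.

```python
import string


def normalize(text: str) -> str:
    """Lowercase, strip ASCII punctuation, and collapse whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


def token_f1(a: str, b: str) -> float:
    """Token-overlap F1 between two strings after normalization."""
    ta, tb = normalize(a).split(), normalize(b).split()
    if not ta or not tb:
        return 0.0
    # Count shared tokens, respecting multiplicity.
    common = sum(min(ta.count(t), tb.count(t)) for t in set(ta))
    if common == 0:
        return 0.0
    precision, recall = common / len(ta), common / len(tb)
    return 2 * precision * recall / (precision + recall)


def snap_to_candidate(prediction: str, candidates: list, threshold: float = 0.5) -> str:
    """Return the candidate closest to the prediction (hypothetical rule).

    If no candidate reaches the assumed similarity threshold, the
    prediction is returned unchanged.
    """
    best = max(candidates, key=lambda c: token_f1(prediction, c), default=None)
    if best is not None and token_f1(prediction, best) >= threshold:
        return best
    return prediction


if __name__ == "__main__":
    candidates = ["my wife", "the hospital", "Tom Hanks"]
    for generated in ["Wife", "tom hanks.", "nobody"]:
        print(repr(generated), "->", repr(snap_to_candidate(generated, candidates)))
```

Under strict string matching, a generated answer such as "Wife" would count as wrong against a gold answer "my wife"; a snapping step of this kind is one way such a near miss could be turned into an exact match before scoring.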