Paper Title

CrowdChecked: Detecting Previously Fact-Checked Claims in Social Media

Authors

Hardalov, Momchil, Chernyavskiy, Anton, Koychev, Ivan, Ilvovsky, Dmitry, Nakov, Preslav

Abstract

While there has been substantial progress in developing systems to automate fact-checking, they still lack credibility in the eyes of the users. Thus, an interesting approach has emerged: to perform automatic fact-checking by verifying whether an input claim has been previously fact-checked by professional fact-checkers and to return back an article that explains their decision. This is a sensible approach as people trust manual fact-checking, and as many claims are repeated multiple times. Yet, a major issue when building such systems is the small number of known tweet--verifying article pairs available for training. Here, we aim to bridge this gap by making use of crowd fact-checking, i.e., mining claims in social media for which users have responded with a link to a fact-checking article. In particular, we mine a large-scale collection of 330,000 tweets paired with a corresponding fact-checking article. We further propose an end-to-end framework to learn from this noisy data based on modified self-adaptive training, in a distant supervision scenario. Our experiments on the CLEF'21 CheckThat! test set show improvements over the state of the art by two points absolute. Our code and datasets are available at https://github.com/mhardalov/crowdchecked-claims
