Paper Title

UrduFake@FIRE2020: Shared Track on Fake News Identification in Urdu

Authors

Maaz Amjad, Grigori Sidorov, Alisa Zhila, Alexander Gelbukh, Paolo Rosso

Abstract

This paper gives an overview of the first shared task at FIRE 2020 on fake news detection in the Urdu language. It is a binary classification task whose goal is to identify fake news, using a dataset of 900 annotated news articles for training and 400 news articles for testing. The dataset contains news in five domains: (i) Health, (ii) Sports, (iii) Showbiz, (iv) Technology, and (v) Business. Forty-two teams from six countries (India, China, Egypt, Germany, Pakistan, and the UK) registered for the task, and nine teams submitted their experimental results. The participants used various machine learning methods, ranging from feature-based traditional machine learning to neural network techniques. The best-performing system achieved an F-score of 0.90, showing that the BERT-based approach outperforms other machine learning classifiers.
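
For readers unfamiliar with the metric, the following minimal sketch shows how an F-score can be computed for a binary fake/real classification task like the one described above. It assumes the common F1 variant (the harmonic mean of precision and recall) computed for the "fake" class; the abstract does not state which variant or averaging scheme the organizers used. The function name `f1_score`, the label strings, and the toy data are hypothetical and are not taken from the paper.

```python
def f1_score(gold, predicted, positive="fake"):
    """Return the F1 score for the `positive` class.

    Assumption: F1 = 2 * P * R / (P + R). The label names and the
    choice of positive class are illustrative only.
    """
    pairs = list(zip(gold, predicted))
    # True positives: fake articles correctly flagged as fake.
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    # False positives: real articles wrongly flagged as fake.
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    # False negatives: fake articles the system missed.
    fn = sum(1 for g, p in pairs if g == positive and p != positive)

    if tp == 0:
        # No correct positive predictions: precision or recall is zero.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example with made-up labels (not data from the shared task).
gold = ["fake", "real", "fake", "real", "fake"]
predicted = ["fake", "real", "real", "real", "fake"]
print(round(f1_score(gold, predicted), 2))
```

In this toy example the system finds two of the three fake articles and raises no false alarms, so precision is 1 and recall is 2/3.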
