UrduFake@FIRE2020: Shared Track on Fake News Identification in Urdu

Paper title

UrduFake@FIRE2020: Shared Track on Fake News Identification in Urdu

Authors

Maaz Amjad, Grigori Sidorov, Alisa Zhila, Alexander Gelbukh, Paolo Rosso

Abstract

This paper gives an overview of the first shared task at FIRE 2020 on fake news detection in the Urdu language. It is a binary classification task whose goal is to identify fake news, using a dataset of 900 annotated news articles for training and 400 news articles for testing. The dataset contains news in five domains: (i) Health, (ii) Sports, (iii) Showbiz, (iv) Technology, and (v) Business. Forty-two teams from six countries (India, China, Egypt, Germany, Pakistan, and the UK) registered for the task, and nine teams submitted experimental results. The participants used a range of machine learning methods, from feature-based traditional machine learning to neural network techniques. The best-performing system achieved an F-score of 0.90, showing that the BERT-based approach outperforms other machine learning classifiers.
