Paper title
Hate is the New Infodemic: A Topic-aware Modeling of Hate Speech Diffusion on Twitter
Paper authors
Paper abstract
Online hate speech, particularly on microblogging platforms like Twitter, has emerged as arguably one of the most severe issues of the past decade. Several countries have reported a steep rise in hate crimes incited by malicious hate campaigns. While the detection of hate speech is an emerging research area, the generation and spread of topic-dependent hate in the information network remain under-explored. In this work, we focus on the user behaviour that triggers the genesis of hate speech on Twitter and on how that speech diffuses via retweets. We crawl a large-scale dataset of tweets, retweets, user activity history, and follower networks, comprising over 161 million tweets from more than 41 million unique users. We also collect over 600k contemporary news articles published online. We characterize the different information signals that govern these dynamics. Our analyses distinguish the diffusion dynamics of hateful content from those of ordinary information diffusion. This motivates us to formulate the modelling problem in a topic-aware setting with real-world knowledge. For predicting the initiation of hate speech for any given hashtag, we propose multiple feature-rich models, the best of which achieves a macro F1 score of 0.65. To predict the retweet dynamics on Twitter, we propose RETINA, a novel neural architecture that incorporates exogenous influence using scaled dot-product attention. RETINA achieves a macro F1 score of 0.85, outperforming multiple state-of-the-art models. Our analysis shows that RETINA predicts the retweet dynamics of hateful content better than existing diffusion models.
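Both results above are reported as macro F1 scores. As a reading aid, the following is a minimal sketch of how macro F1 is commonly computed: the unweighted mean of per-class F1 scores, so a rare class (such as hateful content) counts as much as a frequent one. The function name `macro_f1` and the toy labels are illustrative assumptions and are not taken from the paper or its code.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("macro F1 is undefined for empty inputs")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Made-up toy labels (1 = hateful, 0 = not hateful); not data from the paper.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
print(macro_f1(y_true, y_pred))  # 0.625
```

In this toy case the per-class F1 scores are 0.75 (class 0) and 0.5 (class 1), giving a macro F1 of 0.625, while plain accuracy would be 4/6. This gap is why macro F1 is often preferred when classes are imbalanced.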