Paper Title
Top Gear or Black Mirror: Inferring Political Leaning From Non-Political Content
Paper Authors
Paper Abstract
Polarization and echo chambers are often studied in the context of explicitly political events such as elections, and little scholarship has examined the mixing of political groups in non-political contexts. A major obstacle to studying political polarization in non-political contexts is that political leaning (i.e., left vs. right orientation) is often unknown. Nonetheless, political leaning is known to correlate (sometimes quite strongly) with many lifestyle choices, leading to stereotypes such as the "latte-drinking liberal." We develop a machine learning classifier to infer political leaning from non-political text and, optionally, the accounts a user follows on social media. We use Voter Advice Application results shared on Twitter as our ground truth and train and test our classifier on a Twitter dataset comprising the 3,200 most recent tweets of each user after removing any tweets with political text. We correctly classify the political leaning of most users (F1 scores range from 0.70 to 0.85 depending on coverage). We find no relationship between the level of political activity and our classification results. We apply our classifier to a case study of news sharing in the UK and discover that, in general, the sharing of political news exhibits a distinctive left-right divide while sports news does not.