Paper title
Estimating Topic Exposure for Under-Represented Users on Social Media
Paper authors
Paper abstract
Online Social Networks (OSNs) give researchers access to a wide variety of data, allowing them to analyze users' behavior and develop user behavioral analysis models. These models rely heavily on the observed data, which are usually biased due to participation inequality. This inequality divides online users into three groups: lurkers, who solely consume content; engagers, who contribute little to content creation; and contributors, who are responsible for creating the majority of online content. Failing to consider the contribution of all three groups when interpreting population-level interests or sentiments may yield biased results. To reduce the bias induced by the contributors, in this work we focus on highlighting the engagers' contributions in the observed data, as they are more likely to contribute than lurkers and they comprise a larger population than the contributors. The first step in the behavioral analysis of these users is to find the topics they are exposed to but did not engage with. To do so, we propose a novel framework that helps identify these users and estimates their topic exposure. The exposure estimation mechanism is modeled by incorporating behavioral patterns from similar contributors as well as users' demographic and profile information.