Paper title

Inferring urban social networks from publicly available data

Authors

Stefano Guarino, Enrico Mastrostefano, Massimo Bernaschi, Alessandro Celestini, Marco Cianfriglia, Davide Torre, Lena Zastrow

Abstract

The emergence of social networks and the definition of suitable generative models for synthetic yet realistic social graphs are widely studied problems in the literature. By not being tied to any real data, random graph models cannot capture all the subtleties of real networks and are inadequate for many practical contexts -- including areas of research, such as computational epidemiology, which have recently been high on the agenda. At the same time, the so-called contact networks describe interactions, rather than relationships, and are strongly dependent on the application and on the size and quality of the sample data used to infer them. To fill the gap between these two approaches, we present a data-driven model for urban social networks, implemented and released as open source software. Given a territory of interest, and only based on widely available aggregated demographic and social-mixing data, we construct an age-stratified and geo-referenced synthetic population whose individuals are connected by "strong ties" of two types: intra-household (e.g., kinship) or friendship. While household links are entirely data-driven, we propose a parametric probabilistic model for friendship, based on the assumption that distances and age differences play a role, and that not all individuals are equally sociable. The demographic and geographic factors governing the structure of the obtained network, under different configurations, are thoroughly studied through extensive simulations focused on three Italian cities of different sizes.
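
The abstract names the ingredients of the friendship model (distance, age difference, and individual sociability) but not its functional form. The sketch below is only an illustration of how such a parametric probabilistic model could be put together, not the authors' released implementation. The exponential decay, the product of sociability scores, the `Person` fields, and the parameters `dist_scale_km` and `age_scale_years` are all assumptions made for this example.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Person:
    age: float          # years
    x_km: float         # planar coordinates of the home location (assumed)
    y_km: float
    sociability: float  # hypothetical per-individual score in [0, 1]


def friendship_probability(a: Person, b: Person,
                           dist_scale_km: float = 2.0,
                           age_scale_years: float = 10.0) -> float:
    """Illustrative link probability: decays with distance and age gap,
    and is scaled by both individuals' sociability (assumed form)."""
    distance = math.hypot(a.x_km - b.x_km, a.y_km - b.y_km)
    age_gap = abs(a.age - b.age)
    decay = math.exp(-distance / dist_scale_km) * math.exp(-age_gap / age_scale_years)
    return a.sociability * b.sociability * decay


def sample_friendships(people: list[Person], rng: random.Random) -> list[tuple[int, int]]:
    """Draw each undirected pair independently with its link probability."""
    edges = []
    for i in range(len(people)):
        for j in range(i + 1, len(people)):
            if rng.random() < friendship_probability(people[i], people[j]):
                edges.append((i, j))
    return edges
```

Under these assumptions, two equally sociable people of the same age living at the same location are linked with the highest probability, and the probability shrinks as either the distance or the age gap grows. The quadratic loop over all pairs is meant for clarity only and would not scale to a city-sized population.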
