Paper title
Interest Clustering Coefficient: a New Metric for Directed Networks like Twitter
Paper authors
Paper abstract
We study the clustering of directed social graphs. The clustering coefficient was introduced to capture the social phenomenon that a friend of a friend tends to be my friend. This metric has been widely studied and has proven to be of great interest for describing the characteristics of a social graph. In fact, the clustering coefficient is suited to graphs in which the links are undirected, such as friendship links (Facebook) or professional links (LinkedIn). For a graph in which links are directed from a source of information to a consumer of information, it is no longer adequate. We show that former studies have missed much of the information contained in the directed part of such graphs. We thus introduce a new metric to measure the clustering of a directed social graph with interest links, namely the interest clustering coefficient. We compute it (exactly and using sampling methods) on a very large social graph, a Twitter snapshot with 505 million users and 23 billion links. We additionally provide the values of the formerly introduced directed and undirected metrics, a first on such a large snapshot. We show that the interest clustering coefficient is larger than the classic directed clustering coefficients introduced in the literature. This shows the relevance of the metric for capturing the informational aspects of directed graphs.
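For reference, the classical undirected clustering coefficient that the abstract takes as its starting point can be sketched as follows. This is the standard global (transitivity) definition: the fraction of connected triples "a - node - b" in which a and b are also linked. It is not the interest clustering coefficient, whose definition the abstract does not give. The function name `global_clustering` and the toy edge lists are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations


def global_clustering(edges):
    """Global clustering coefficient (transitivity) of an undirected graph.

    `edges` is an iterable of (u, v) pairs; direction is ignored.
    Returns closed triples / all connected triples, or 0.0 if there are none.
    """
    # Build an undirected adjacency map, ignoring self-loops.
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    triples = 0  # paths a - node - b centred on each node
    closed = 0   # those paths for which a and b are also linked
    for node, neighbours in adj.items():
        for a, b in combinations(neighbours, 2):
            triples += 1
            if b in adj[a]:
                closed += 1

    return closed / triples if triples else 0.0


# A triangle: every friend of a friend is a friend.
triangle = [(1, 2), (2, 3), (1, 3)]
# A path: the two end nodes share a friend but are not linked.
path = [(1, 2), (2, 3)]

print(global_clustering(triangle))
print(global_clustering(path))
```

Because the adjacency map is symmetric, a follow link from a source to a consumer is treated exactly like a mutual friendship, which is the loss of directional information the abstract points to.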