An Approach for Link Prediction in Directed Complex Networks based on Asymmetric Similarity-Popularity

Paper title

An Approach for Link Prediction in Directed Complex Networks based on Asymmetric Similarity-Popularity

Authors

Benhidour, Hafida; Almeshkhas, Lama; Kerrache, Said

Abstract

Complex networks are graphs representing real-life systems that exhibit unique characteristics not found in purely regular or completely random graphs. The study of such systems is vital but challenging due to the complexity of the underlying processes. This task has nevertheless been made easier in recent decades thanks to the availability of large amounts of networked data. Link prediction in complex networks aims to estimate the likelihood that a link between two nodes is missing from the network. Links can be missing due to imperfections in data collection or simply because they are yet to appear. Discovering new relationships between entities in networked data has attracted researchers' attention in various domains such as sociology, computer science, physics, and biology. Most existing research focuses on link prediction in undirected complex networks. However, not all real-life systems can be faithfully represented as undirected networks. This simplifying assumption is often made when using link prediction algorithms but inevitably leads to loss of information about relations among nodes and degradation in prediction performance. This paper introduces a link prediction method designed explicitly for directed networks. It is based on the similarity-popularity paradigm, which has recently proven successful in undirected networks. The presented algorithms handle the asymmetry in node relationships by modeling it as asymmetry in similarity and popularity. Given the observed network topology, the algorithms approximate the hidden similarities as shortest path distances using edge weights that capture and factor out the links' asymmetry and nodes' popularity. The proposed approach is evaluated on real-life networks, and the experimental results demonstrate its effectiveness in predicting missing links across a broad spectrum of networked data types and sizes.
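The idea of approximating hidden similarity by shortest path distances over popularity-adjusted edge weights can be illustrated with a small sketch. The abstract does not give the paper's actual weighting or scoring formulas, so everything below is an assumption made for illustration: the function names (`build_graph`, `edge_weight`, `directed_distances`, `link_score`), the logarithmic weight, and the `1 / (1 + d)` score are all hypothetical. The sketch only shows that distances computed on a directed graph are naturally asymmetric.

```python
import heapq
import math
from collections import defaultdict


def build_graph(edges):
    """Return out-neighbour sets and in-degrees for a directed edge list."""
    out_adj = defaultdict(set)
    in_deg = defaultdict(int)
    for u, v in edges:
        if v not in out_adj[u]:
            out_adj[u].add(v)
            in_deg[v] += 1
        out_adj[v]  # touch the key so nodes without out-links are registered
    return out_adj, in_deg


def edge_weight(u, v, out_adj, in_deg):
    # ASSUMPTION (not from the paper): a link from an active source to a
    # popular target is largely explained by popularity, so it is treated as
    # weaker evidence of similarity and given a larger weight (distance).
    return math.log(1 + len(out_adj[u])) + math.log(1 + in_deg[v])


def directed_distances(source, out_adj, in_deg):
    """Dijkstra over out-links only; unreachable nodes are absent."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v in out_adj[u]:
            nd = d + edge_weight(u, v, out_adj, in_deg)
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def link_score(u, v, out_adj, in_deg):
    # ASSUMPTION: a hypothetical score that decreases with directed distance;
    # an unreachable target gets a score of 0.0.
    d = directed_distances(u, out_adj, in_deg).get(v, math.inf)
    return 1.0 / (1.0 + d)


edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "c")]
out_adj, in_deg = build_graph(edges)
print(link_score("a", "c", out_adj, in_deg))  # score in the direction a -> c
print(link_score("c", "a", out_adj, in_deg))  # score in the direction c -> a
```

Because only out-links are followed, the score for the ordered pair (a, c) generally differs from the score for (c, a), which is the kind of asymmetry an undirected simplification would discard.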
