Truncated tensor Schatten p-norm based approach for spatiotemporal traffic data imputation with complicated missing patterns

Paper title

Truncated tensor Schatten p-norm based approach for spatiotemporal traffic data imputation with complicated missing patterns

Authors

Nie, Tong; Qin, Guoyang; Sun, Jian

Abstract

Rapid advances in sensors, wireless communication, cloud computing and data science have brought an unprecedented amount of data to assist transportation engineers and researchers in making better decisions. However, real-world traffic data often contains corrupted or incomplete values due to detector and communication malfunctions. Data imputation is thus required to ensure the effectiveness of downstream data-driven applications. To this end, numerous tensor-based methods in previous works have treated the imputation problem as a low-rank tensor completion (LRTC) problem. To tackle rank minimization, which is at the core of LRTC, most of the aforementioned methods use the tensor nuclear norm (NN) as a convex surrogate for the rank. However, the over-relaxation issue in NN prevents it from achieving desirable performance in practice. In this paper, we define an innovative nonconvex truncated Schatten p-norm for tensors (TSpN) to approximate tensor rank and impute missing spatiotemporal traffic data under the LRTC framework. We model traffic data as a third-order tensor with the structure (time intervals, locations (sensors), days) and introduce four complicated missing patterns: random missing and three fiber-like missing cases defined according to the tensor mode-n fibers. Despite the nonconvexity of the objective function in our model, we derive the global optimal solutions by integrating the alternating direction method of multipliers (ADMM) with generalized soft-thresholding (GST). In addition, we design a truncation rate decay strategy to deal with varying missing rate scenarios. Finally, comprehensive experiments are conducted using real-world spatiotemporal datasets. They demonstrate that the proposed LRTC-TSpN method performs well under various missing cases, while outperforming other state-of-the-art (SOTA) tensor-based imputation models in almost all scenarios.
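
To make two ingredients named in the abstract more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the commonly used matrix form of the truncated Schatten p-norm (the sum of sigma_i^p over all but the r largest singular values) and the commonly cited scalar form of generalized soft-thresholding for the problem min_x 0.5*(x - y)^2 + lam*|x|^p. The function names `truncated_schatten_p` and `gst`, the parameter names, and the fixed iteration count are illustrative assumptions. The abstract does not give the paper's exact tensor definition or how these pieces are arranged inside ADMM.

```python
import math


def truncated_schatten_p(singular_values, r, p):
    """Sum of sigma_i**p over all but the r largest singular values.

    Assumed matrix form; the paper's tensor definition is not in the abstract.
    """
    tail = sorted(singular_values, reverse=True)[r:]
    return sum(s ** p for s in tail)


def gst(y, lam, p, iters=50):
    """Approximately solve min_x 0.5*(x - y)**2 + lam*|x|**p for 0 < p <= 1.

    Inputs with |y| at or below the threshold tau map to zero; larger inputs
    are shrunk by a fixed-point iteration on the stationarity condition.
    """
    base = 2.0 * lam * (1.0 - p)
    tau = base ** (1.0 / (2.0 - p)) + lam * p * base ** ((p - 1.0) / (2.0 - p))
    if abs(y) <= tau:
        return 0.0
    x = abs(y)
    for _ in range(iters):
        # Fixed-point update derived from x - |y| + lam*p*x**(p-1) = 0.
        x = abs(y) - lam * p * x ** (p - 1.0)
    return math.copysign(x, y)
```

With p = 1 the threshold reduces to lam and `gst` coincides with ordinary soft-thresholding, which is the shrinkage associated with the nuclear norm. Choosing p < 1 shrinks large values less, which is one way to read the abstract's remark about over-relaxation in NN.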
