Paper title
Structured Time Series Prediction without Structural Prior
Paper authors
Paper abstract
Time series prediction is a widespread and well-studied problem with applications in many domains (medicine, geoscience, network analysis, finance, econometrics, etc.). In the case of multivariate time series, the key to good performance is to properly capture the dependencies between the variates. Often, these variates are structured, i.e. they are localised in an abstract space, usually representing an aspect of the physical world, and prediction amounts to a form of diffusion of the information across that space over time. Several neural network models of diffusion have been proposed in the literature. However, most existing proposals rely on some a priori knowledge of the structure of the space, usually in the form of a graph weighting the pairwise diffusion capacity of its points. We argue that this piece of information can often be dispensed with, since the data already contain the diffusion capacity information, and in a more reliable form than that obtained from the usual, largely hand-crafted graphs. We propose instead a fully data-driven model which relies neither on such a graph nor on any other prior structural information. We conduct a first set of experiments to measure the impact on performance of a structural prior, as used in baseline models, and show that, except at very low data levels, it remains negligible, and that beyond a threshold it may even become detrimental. We then investigate, through a second set of experiments, the capacity of our model in two respects: treatment of missing data and domain adaptation.