Paper Title
NAPLES: Mining the Lead-Lag Relationship from Non-synchronous and High-frequency Data
Paper Authors
Paper Abstract
In time-series analysis, the term "lead-lag effect" describes a delayed effect that one time series has on another. Lead-lag effects are ubiquitous in practice and are particularly critical in formulating investment strategies in high-frequency trading. At present, there are three major challenges in analyzing lead-lag effects. First, in practical applications, not all time series are observed synchronously. Second, the relevant datasets keep growing and the environment changes ever faster, so it is becoming more difficult to complete the computation within a given time limit. Third, some lead-lag effects are time-varying and last only for a short period, and their delay lengths are often affected by external factors. In this paper, we propose NAPLES (Negative And Positive Lead-lag EStimator), a new statistical measure that resolves all these problems. Through experiments on artificial and real datasets, we demonstrate that NAPLES has a strong correlation with the actual lead-lag effects, including those triggered by significant macroeconomic announcements.