Paper Title
Exploratory Control with Tsallis Entropy for Latent Factor Models
Paper Authors
Abstract
We study optimal control in latent factor models where the agent controls the distribution over actions, rather than the actions themselves, in both discrete and continuous time. To encourage exploration of the state space, we reward exploration with Tsallis entropy and derive the optimal distribution over states, which we prove is $q$-Gaussian with location characterized by the solution of an FBS$\Delta$E in discrete time and an FBSDE in continuous time. We discuss the relation between the solutions of the optimal exploration problems and the standard dynamic optimal control solution. Finally, we develop the optimal policy in a model-agnostic setting along the lines of soft $Q$-learning. The approach may be applied, e.g., to developing more robust statistical arbitrage trading strategies.
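For readers unfamiliar with the exploration reward named in the abstract, the following is a minimal sketch of the Tsallis entropy of a discrete distribution, using the standard textbook form $S_q(p) = (1 - \sum_i p_i^q)/(q - 1)$, which recovers Shannon entropy as $q \to 1$. The function name `tsallis_entropy` and the discrete setting are assumptions made for illustration; the paper's own parametrization and its continuous-action formulation may differ.

```python
import math


def tsallis_entropy(p, q):
    """Tsallis entropy S_q(p) = (1 - sum_i p_i**q) / (q - 1).

    Illustrative helper (hypothetical name, not taken from the paper).
    `p` is a discrete probability vector; `q` is the entropic index.
    """
    if any(x < 0 for x in p) or not math.isclose(sum(p), 1.0, abs_tol=1e-9):
        raise ValueError("p must be a probability vector")
    if math.isclose(q, 1.0):
        # Limit q -> 1: Shannon entropy in nats.
        return -sum(x * math.log(x) for x in p if x > 0)
    # Zero-probability outcomes contribute nothing to the sum.
    return (1.0 - sum(x ** q for x in p if x > 0)) / (q - 1.0)


if __name__ == "__main__":
    uniform = [0.25, 0.25, 0.25, 0.25]
    for q in (0.5, 1.0, 2.0):
        print(q, tsallis_entropy(uniform, q))
```

A point mass has zero Tsallis entropy for any valid $q$, while spreading probability more evenly increases it, which is why an entropy bonus of this kind rewards policies that randomize over actions.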