Paper Title

Distributional Reinforcement Learning for Scheduling of Chemical Production Processes

Authors

Max Mowbray, Dongda Zhang, Ehecatl Antonio del Rio Chanona

Abstract

Reinforcement Learning (RL) has recently received significant attention from the process systems engineering and control communities. Recent works have investigated the application of RL to identify optimal scheduling decisions in the presence of uncertainty. In this work, we present an RL methodology tailored to efficiently address production scheduling problems in the presence of uncertainty. We consider commonly imposed restrictions on these problems, such as precedence and disjunctive constraints, which are not naturally considered by RL in other contexts. Additionally, this work naturally enables the optimization of risk-sensitive formulations such as the conditional value-at-risk (CVaR), which are essential in realistic scheduling processes. The proposed strategy is investigated thoroughly in a parallel batch production environment and benchmarked against mixed integer linear programming (MILP) strategies. We show that the policy identified by our approach is able to account for plant uncertainties in online decision-making, with expected performance comparable to that of existing MILP methods. Additionally, the framework gains the benefits of optimizing for risk-sensitive measures and identifies online decisions orders of magnitude faster than the most efficient optimization approaches. This promises to mitigate practical issues and to ease the handling of realizations of process uncertainty in the paradigm of online production scheduling.
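The abstract's risk-sensitive objective, CVaR, can be illustrated with a small empirical calculation. The sketch below is not the paper's method; the `cvar` helper and the two simulated "policies" are hypothetical, assuming a reward convention where larger return is better, so CVaR is the mean of the worst alpha-fraction of sampled returns:

```python
import numpy as np

def cvar(returns, alpha=0.1):
    """Empirical CVaR_alpha: mean of the worst alpha-fraction of returns
    (lower tail, since here a larger return means a better schedule)."""
    r = np.sort(np.asarray(returns, dtype=float))
    k = max(1, int(np.ceil(alpha * r.size)))  # number of tail samples
    return float(r[:k].mean())

# Hypothetical Monte Carlo rollouts of two scheduling policies with
# equal mean profit but different spread under plant uncertainty:
rng = np.random.default_rng(0)
policy_a = rng.normal(100.0, 5.0, 10_000)   # low-variance returns
policy_b = rng.normal(100.0, 25.0, 10_000)  # high-variance returns

# A CVaR objective distinguishes the two even though their means match:
# it prefers policy_a, whose worst-case tail is less severe.
print(cvar(policy_a) > cvar(policy_b))
```

This is why optimizing CVaR rather than the expectation matters for realistic scheduling: two policies with the same expected performance can expose the plant to very different downside risk.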
