Paper Title
Analog Circuit Design with Dyna-Style Reinforcement Learning
Paper Authors
Paper Abstract
In this work, we present a learning-based approach to analog circuit design, where the goal is to optimize circuit performance subject to certain design constraints. One aspect that makes this problem challenging is that measuring the performance of candidate configurations via simulation can be computationally expensive, particularly in post-layout design. Additionally, the large number of design constraints and the interactions between the relevant quantities make the problem complex. Therefore, to better support human designers, it is desirable to gain knowledge about the whole space of feasible solutions. To tackle these challenges, we take inspiration from model-based reinforcement learning and propose a method with two key properties. First, it learns a reward model, i.e., a surrogate model of the performance approximated by neural networks, to reduce the required number of simulations. Second, it uses a stochastic policy generator to explore the diverse space of solutions that satisfy the constraints. We combine these in a Dyna-style optimization framework, which we call DynaOpt, and empirically evaluate its performance on a circuit benchmark of a two-stage operational amplifier. The results show that, compared to a model-free method that trains its policy with 20,000 circuit simulations, DynaOpt achieves much better performance by learning from scratch with only 500 simulations.
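The core idea of the abstract can be sketched as a minimal Dyna-style loop: spend a small budget of expensive "real" simulations, fit a cheap surrogate reward model on the collected data, and let a stochastic policy do most of its exploration against the surrogate. The sketch below is illustrative only and not the paper's DynaOpt algorithm: the `simulate` function is a hypothetical stand-in for a circuit simulator, the surrogate is a k-nearest-neighbour average rather than the neural network the paper uses, and the policy update is a simple cross-entropy-method-style step.

```python
import random

random.seed(0)

# Hypothetical stand-in for an expensive circuit simulator (e.g. SPICE).
# Reward peaks at an optimum the optimizer does not know: x* = (0.3, -0.5).
def simulate(x):
    return -((x[0] - 0.3) ** 2 + (x[1] + 0.5) ** 2)

# Surrogate reward model: k-nearest-neighbour average over simulated data.
# (The paper approximates the reward with neural networks; kNN keeps this
# sketch dependency-free while playing the same role.)
def surrogate(x, data, k=5):
    near = sorted(data, key=lambda d: (d[0][0] - x[0]) ** 2 + (d[0][1] - x[1]) ** 2)
    near = near[:k]
    return sum(r for _, r in near) / len(near)

# Stochastic policy: an isotropic Gaussian over the 2 design parameters.
mean, std = [0.0, 0.0], 1.0
data = []          # (parameters, simulated reward) pairs
sim_budget = 100   # real simulations are the scarce resource

for it in range(10):
    # 1) Spend a few real simulations on samples from the current policy.
    for _ in range(sim_budget // 10):
        x = [random.gauss(m, std) for m in mean]
        data.append((x, simulate(x)))
    # 2) Dyna-style planning: score many candidates with the cheap surrogate.
    cands = [[random.gauss(m, std) for m in mean] for _ in range(500)]
    cands.sort(key=lambda c: surrogate(c, data), reverse=True)
    elite = cands[:50]
    # 3) Update the policy toward the surrogate-ranked elite and shrink it.
    mean = [sum(e[i] for e in elite) / len(elite) for i in range(2)]
    std = max(0.05, std * 0.8)

best = max(data, key=lambda d: d[1])
print("best simulated reward:", round(best[1], 3))
```

In this toy setup only 100 calls to `simulate` are made, while 5,000 candidate evaluations run against the surrogate, mirroring the paper's motivation of substituting cheap model queries for expensive simulations.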