Paper title
Coefficient Mutation in the Gene-pool Optimal Mixing Evolutionary Algorithm for Symbolic Regression
Paper authors
Abstract
Currently, the genetic programming version of the gene-pool optimal mixing evolutionary algorithm (GP-GOMEA) is among the top-performing algorithms for symbolic regression (SR). A key strength of GP-GOMEA is its way of performing variation, which dynamically adapts to the emergence of patterns in the population. However, GP-GOMEA lacks a mechanism to optimize coefficients. In this paper, we study how fairly simple approaches for optimizing coefficients can be integrated into GP-GOMEA. In particular, we consider two variants of Gaussian coefficient mutation. We perform experiments using different settings on 23 benchmark problems, and use machine learning to estimate which aspects of coefficient mutation matter most. We find that the most important aspect is that the number of coefficient mutation attempts needs to be commensurate with the number of mixing operations that GP-GOMEA performs. We then apply GP-GOMEA with the best-performing coefficient mutation approach to the data sets of SRBench, a large SR benchmark, for which a ground-truth underlying equation is known. We find that coefficient mutation can substantially help in rediscovering the underlying equation, but only when no noise is added to the target variable. In the presence of noise, GP-GOMEA with coefficient mutation discovers alternative but similarly accurate equations.
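As a rough illustration of the Gaussian coefficient mutation mentioned in the abstract, the following Python sketch perturbs the numeric coefficients of a fixed model with Gaussian noise and keeps a mutation only if the error does not get worse. The relative noise scale `sigma`, the accept-if-not-worse rule, the toy linear model `a * x + b`, and all function names are assumptions made for this example. The abstract does not describe the two variants studied, and this is not the GP-GOMEA implementation.

```python
import random


def gaussian_coefficient_mutation(coeffs, sigma, rng):
    """Return a mutated copy of coeffs.

    Assumption: the noise standard deviation is relative to each
    coefficient's magnitude (sigma * |c|); a zero coefficient falls
    back to an absolute standard deviation of sigma.
    """
    mutated = []
    for c in coeffs:
        std = sigma * abs(c) if c != 0 else sigma
        mutated.append(c + rng.gauss(0.0, std))
    return mutated


def mse(coeffs, xs, ys):
    """Mean squared error of the toy model y = a * x + b (hypothetical model)."""
    a, b = coeffs
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mutate_and_keep_if_not_worse(coeffs, xs, ys, attempts=200, sigma=0.1, seed=0):
    """Repeatedly mutate the coefficients; keep a candidate only if the error
    does not increase. The number of attempts is a free parameter here."""
    rng = random.Random(seed)
    best = list(coeffs)
    best_err = mse(best, xs, ys)
    for _ in range(attempts):
        candidate = gaussian_coefficient_mutation(best, sigma, rng)
        err = mse(candidate, xs, ys)
        if err <= best_err:
            best, best_err = candidate, err
    return best, best_err


# Toy data generated from y = 2x + 1, with a deliberately poor starting guess.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * x + 1.0 for x in xs]
start = [1.0, 0.5]
start_err = mse(start, xs, ys)
final, final_err = mutate_and_keep_if_not_worse(start, xs, ys)
```

Because a candidate is accepted only when its error is no larger than the current one, `final_err` can never exceed `start_err`. How many such attempts to make relative to the algorithm's other variation steps is the kind of setting the abstract identifies as most important.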