Paper Title
Learning Not to Learn: Nature versus Nurture in Silico
Paper Authors
Paper Abstract
Animals are equipped with a rich innate repertoire of sensory, behavioral and motor skills, which allows them to interact with the world immediately after birth. At the same time, many behaviors are highly adaptive and can be tailored to specific environments by means of learning. In this work, we use mathematical analysis and the framework of meta-learning (or 'learning to learn') to answer when it is beneficial to learn such an adaptive strategy and when to hard-code a heuristic behavior. We find that the interplay of ecological uncertainty, task complexity and the agent's lifetime has crucial effects on the meta-learned amortized Bayesian inference performed by an agent. There exist two regimes: one in which meta-learning yields a learning algorithm that implements task-dependent information integration, and a second in which meta-learning imprints a heuristic or 'hard-coded' behavior. Further analysis reveals that non-adaptive behaviors are optimal not only for aspects of the environment that are stable across individuals, but also in situations where adapting to the environment would in fact be highly beneficial, yet could not be done quickly enough to be exploited within the remaining lifetime. Hard-coded behaviors should hence not only be those that always work, but also those that are too complex to be learned within a reasonable time frame.
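The trade-off between lifetime and learning described in the abstract can be illustrated with a toy calculation. The sketch below is not the model or the meta-learned agent from the paper. It assumes a hypothetical two-armed bandit in which one arm always pays 0 and the other pays a noisy reward whose mean is drawn from a Gaussian prior, and it assumes a simple explore-then-commit agent that samples the uncertain arm k times and then commits to whichever arm has the higher posterior mean. All names and default parameters (`prior_mean`, `sigma_prior`, `sigma_noise`, `lifetime_value`, `best_exploration`) are illustrative assumptions.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_pdf(z):
    """Standard normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def lifetime_value(k, lifetime, prior_mean=-1.0, sigma_prior=1.0, sigma_noise=1.0):
    """Expected total reward of an explore-then-commit agent (toy assumption).

    The safe arm always pays 0. The uncertain arm pays mu + noise, with
    mu ~ N(prior_mean, sigma_prior^2) and noise ~ N(0, sigma_noise^2).
    The agent pulls the uncertain arm k times, then commits for the
    remaining (lifetime - k) steps to the arm with the higher posterior mean.
    """
    if k == 0:
        # No learning: a hard-coded choice of the arm that is better a priori.
        return lifetime * max(prior_mean, 0.0)
    # Before any data is seen, the posterior mean after k samples is
    # distributed as N(prior_mean, s^2) with the variance below.
    s = math.sqrt(sigma_prior**4 * k / (k * sigma_prior**2 + sigma_noise**2))
    z = prior_mean / s
    # E[max(m, 0)] for m ~ N(prior_mean, s^2): expected per-step reward
    # once the agent has committed.
    commit_reward = prior_mean * normal_cdf(z) + s * normal_pdf(z)
    # Each exploration pull has expected reward prior_mean.
    return k * prior_mean + (lifetime - k) * commit_reward


def best_exploration(lifetime, **env):
    """Number of exploration pulls that maximizes expected lifetime reward."""
    return max(range(lifetime + 1), key=lambda k: lifetime_value(k, lifetime, **env))


if __name__ == "__main__":
    for lifetime in (5, 50, 1000):
        print(lifetime, best_exploration(lifetime))
```

Under these assumed defaults, a lifetime of 5 steps makes zero exploration optimal (the hard-coded choice), while a lifetime of 1000 steps makes some exploration worthwhile. Increasing `sigma_noise` makes the task harder to learn and pushes the optimum back toward the hard-coded choice, which mirrors the abstract's qualitative claim.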