Paper Title
Transformers Learn Shortcuts to Automata
Paper Authors
Paper Abstract
Algorithmic reasoning requires capabilities which are most naturally understood through recurrent models of computation, like the Turing machine. However, Transformer models, while lacking recurrence, are able to perform such reasoning using far fewer layers than the number of reasoning steps. This raises the question: what solutions are learned by these shallow and non-recurrent models? We find that a low-depth Transformer can represent the computations of any finite-state automaton (thus, any bounded-memory algorithm), by hierarchically reparameterizing its recurrent dynamics. Our theoretical results characterize shortcut solutions, whereby a Transformer with $o(T)$ layers can exactly replicate the computation of an automaton on an input sequence of length $T$. We find that polynomial-sized $O(\log T)$-depth solutions always exist; furthermore, $O(1)$-depth simulators are surprisingly common, and can be understood using tools from Krohn-Rhodes theory and circuit complexity. Empirically, we perform synthetic experiments by training Transformers to simulate a wide variety of automata, and show that shortcut solutions can be learned via standard training. We further investigate the brittleness of these solutions and propose potential mitigations.
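The intuition behind the $O(\log T)$-depth claim can be illustrated outside the Transformer setting: composing an automaton's per-symbol transition maps is associative, so the $T$ maps for an input sequence can be combined in a balanced tree of depth $O(\log T)$ instead of being applied one step at a time. The plain-Python sketch below is only an illustration of that associativity idea on a two-state parity automaton; the function names (e.g. `simulate_logdepth`) are ours, and the paper's actual construction realizes this composition inside attention layers rather than as explicit code.

```python
def simulate_sequential(transitions, q0, inputs):
    """Baseline: run the automaton one step at a time (depth T)."""
    q = q0
    for a in inputs:
        # transitions[a] maps each state to its successor under symbol a
        q = transitions[a][q]
    return q

def compose(f, g):
    """Compose two state-to-state maps: apply f first, then g."""
    return tuple(g[s] for s in f)

def simulate_logdepth(transitions, q0, inputs):
    """Shortcut idea: since composition of transition maps is associative,
    the T maps can be combined pairwise in a balanced tree of depth O(log T)."""
    maps = [transitions[a] for a in inputs]
    while len(maps) > 1:
        # one "layer": combine adjacent pairs (all pairs are independent)
        paired = [compose(maps[i], maps[i + 1]) for i in range(0, len(maps) - 1, 2)]
        if len(maps) % 2 == 1:
            paired.append(maps[-1])
        maps = paired
    return maps[0][q0]

# Example: parity automaton over {0, 1} with states {0, 1};
# symbol 1 flips the state, symbol 0 keeps it.
transitions = {0: (0, 1), 1: (1, 0)}
inputs = [1, 0, 1, 1, 0, 1]
assert simulate_sequential(transitions, 0, inputs) == simulate_logdepth(transitions, 0, inputs)
```

Each iteration of the `while` loop plays the role of one parallel layer: all pairwise compositions within a layer are independent, so $T$ transition maps collapse to a single map after about $\log_2 T$ layers, matching the depth scaling described in the abstract.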