Paper Title

A Study on Encodings for Neural Architecture Search

Paper Authors

Colin White, Willie Neiswanger, Sam Nolen, Yash Savani

Paper Abstract

Neural architecture search (NAS) has been extensively studied in the past few years. A popular approach is to represent each neural architecture in the search space as a directed acyclic graph (DAG), and then search over all DAGs by encoding the adjacency matrix and list of operations as a set of hyperparameters. Recent work has demonstrated that even small changes to the way each architecture is encoded can have a significant effect on the performance of NAS algorithms. In this work, we present the first formal study on the effect of architecture encodings for NAS, including a theoretical grounding and an empirical study. First we formally define architecture encodings and give a theoretical characterization on the scalability of the encodings we study. Then we identify the main encoding-dependent subroutines which NAS algorithms employ, running experiments to show which encodings work best with each subroutine for many popular algorithms. The experiments act as an ablation study for prior work, disentangling the algorithmic and encoding-based contributions, and as a guideline for future work. Our results demonstrate that NAS encodings are an important design decision which can have a significant impact on overall performance. Our code is available at https://github.com/naszilla/nas-encodings.
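The abstract describes encoding each cell architecture as a DAG via its adjacency matrix plus a list of node operations. The sketch below is a minimal, illustrative Python example of one such adjacency-style encoding; the operation set, function name, and vector layout are assumptions for illustration and do not reproduce the exact encodings implemented in the naszilla/nas-encodings repository.

```python
# Minimal sketch of an adjacency-matrix encoding of a NAS cell.
# Illustrative only; not the paper's implementation.
import numpy as np

OPS = ["conv3x3", "conv1x1", "maxpool3x3"]  # hypothetical operation set


def encode_adjacency(adjacency, op_labels):
    """Flatten a cell's DAG into a fixed-length hyperparameter vector.

    adjacency: (n, n) upper-triangular 0/1 matrix of directed edges.
    op_labels: operation names for the intermediate nodes.
    """
    n = adjacency.shape[0]
    # Upper-triangular entries encode the DAG edges.
    edge_bits = adjacency[np.triu_indices(n, k=1)]
    # One-hot encode each intermediate node's operation.
    op_bits = np.zeros((len(op_labels), len(OPS)))
    for i, op in enumerate(op_labels):
        op_bits[i, OPS.index(op)] = 1.0
    return np.concatenate([edge_bits.astype(float), op_bits.ravel()])


# Example: a 4-node cell (input node, two intermediate nodes, output node).
adj = np.array([
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
])
ops = ["conv3x3", "maxpool3x3"]  # operations on the two intermediate nodes
print(encode_adjacency(adj, ops))
```

In this layout, the first block of the vector lists the upper-triangular edge indicators and the remaining block one-hot encodes the operations, so small changes to either block correspond to the encoding perturbations the paper studies.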
