Paper Title
An Approximation Algorithm for Optimal Subarchitecture Extraction
Paper Authors
Paper Abstract
We consider the problem of finding the set of architectural parameters for a chosen deep neural network which is optimal under three metrics: parameter size, inference speed, and error rate. In this paper we state the problem formally, and present an approximation algorithm that, for a large subset of instances, behaves like an FPTAS with an approximation error of $\rho \leq |1 - \epsilon|$, and that runs in $O(|\Xi| + |W^*_T|(1 + |\Theta||B||\Xi|/(\epsilon\, s^{3/2})))$ steps, where $\epsilon$ and $s$ are input parameters; $|B|$ is the batch size; $|W^*_T|$ denotes the cardinality of the largest weight set assignment; and $|\Xi|$ and $|\Theta|$ are the cardinalities of the candidate architecture and hyperparameter spaces, respectively.
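To make the stated step bound concrete, the sketch below simply evaluates the expression $|\Xi| + |W^*_T|(1 + |\Theta||B||\Xi|/(\epsilon\, s^{3/2}))$ for illustrative input sizes. The numeric values (and the function name `step_bound`) are hypothetical examples chosen here for illustration, not figures from the paper, and the constants hidden by the $O(\cdot)$ notation are ignored.

```python
def step_bound(xi: int, w_star_t: int, theta: int, batch: int, eps: float, s: float) -> float:
    """Evaluate |Xi| + |W*_T| * (1 + |Theta| * |B| * |Xi| / (eps * s^(3/2))).

    All arguments are example values; asymptotic constants are ignored.
    """
    return xi + w_star_t * (1 + (theta * batch * xi) / (eps * s ** 1.5))

# Hypothetical instance: 24 candidate architectures, 3 hyperparameter settings,
# batch size 512, largest weight-set assignment of 10^6 weights, eps = 0.5, s = 4.
print(f"{step_bound(xi=24, w_star_t=10**6, theta=3, batch=512, eps=0.5, s=4.0):.3e}")
```

As the example suggests, the bound grows linearly in the candidate-architecture, hyperparameter, and batch sizes, while larger values of $\epsilon$ and $s$ reduce the dominant term.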