Paper Title

An Approximation Algorithm for Optimal Subarchitecture Extraction

Author

de Wynter, Adrian

Abstract

We consider the problem of finding the set of architectural parameters for a chosen deep neural network that is optimal under three metrics: parameter size, inference speed, and error rate. In this paper we state the problem formally and present an approximation algorithm that, for a large subset of instances, behaves like an FPTAS with an approximation error of $ρ \leq |1 - ε|$, and that runs in $O(|Ξ| + |W^*_T|(1 + |Θ||B||Ξ|/(ε\, s^{3/2})))$ steps, where $ε$ and $s$ are input parameters; $|B|$ is the batch size; $|W^*_T|$ denotes the cardinality of the largest weight set assignment; and $|Ξ|$ and $|Θ|$ are the cardinalities of the candidate architecture and hyperparameter spaces, respectively.
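To get a feel for how the stated running-time bound scales with its inputs, the expression inside the $O(\cdot)$ can be evaluated directly. The following is only an illustrative sketch: the function name `step_bound` and its argument names are hypothetical, the constant factor hidden by the big-O notation is ignored, and the sample values are made up for the example rather than taken from the paper.

```python
def step_bound(xi, w_star, theta, batch, eps, s):
    """Evaluate |Xi| + |W*_T| * (1 + |Theta||B||Xi| / (eps * s^(3/2))).

    This is the expression inside the O(.) from the abstract, with the
    hidden constant factor ignored. All argument names are hypothetical.
    """
    if eps <= 0 or s <= 0:
        raise ValueError("eps and s must be positive")
    return xi + w_star * (1 + theta * batch * xi / (eps * s ** 1.5))


# Made-up sample values: 10 candidate architectures, 100 weights,
# 5 hyperparameter settings, batch size 32, s = 4.
loose = step_bound(xi=10, w_star=100, theta=5, batch=32, eps=0.5, s=4)
tight = step_bound(xi=10, w_star=100, theta=5, batch=32, eps=0.25, s=4)
print(loose, tight)
```

Under these assumptions, halving $ε$ roughly doubles the dominant term, which matches the $1/ε$ dependence in the bound, while increasing $s$ reduces it at the rate $s^{-3/2}$.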
