Paper Title
Activate or Not: Learning Customized Activation
Paper Authors
Paper Abstract
We present a simple, effective, and general activation function we term ACON which learns to activate the neurons or not. Interestingly, we find Swish, the recent popular NAS-searched activation, can be interpreted as a smooth approximation to ReLU. Intuitively, in the same way, we approximate the more general Maxout family to our novel ACON family, which remarkably improves the performance and makes Swish a special case of ACON. Next, we present meta-ACON, which explicitly learns to optimize the parameter switching between non-linear (activate) and linear (inactivate) and provides a new design space. By simply changing the activation function, we show its effectiveness on both small models and highly optimized large models (e.g. it improves the ImageNet top-1 accuracy rate by 6.7% and 1.8% on MobileNet-0.25 and ResNet-152, respectively). Moreover, our novel ACON can be naturally transferred to object detection and semantic segmentation, showing that ACON is an effective alternative in a variety of tasks. Code is available at https://github.com/nmaac/acon.
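The abstract does not spell out the activation formula, so the following is only a minimal NumPy sketch of the ACON-C form described in the paper: a smooth maximum of the two linear branches p1*x and p2*x, where the switching factor beta controls whether the unit behaves non-linearly (activate) or linearly (inactivate). The function name acon_c and the default parameter values here are illustrative, not the authors' reference implementation (see the linked repository for that).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def acon_c(x, p1=1.0, p2=0.0, beta=1.0):
    """Sketch of ACON-C: a smooth approximation of max(p1*x, p2*x).

    (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x

    With p1 = 1 and p2 = 0 this reduces to Swish, x * sigmoid(beta * x).
    As beta -> infinity it approaches max(p1*x, p2*x); as beta -> 0 it
    degenerates to the linear map (p1 + p2) / 2 * x, i.e. the neuron
    effectively deactivates.
    """
    d = (p1 - p2) * x
    return d * sigmoid(beta * d) + p2 * x

x = np.linspace(-3.0, 3.0, 7)
print(acon_c(x, p1=1.0, p2=0.0, beta=1.0))  # equals Swish: x * sigmoid(x)
```

In meta-ACON, beta is not a fixed scalar but is predicted per channel by a small learned module from the input features, which is the explicit "switch" between the non-linear and linear regimes that the abstract refers to.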