Paper Title

Field-wise Embedding Size Search via Structural Hard Auxiliary Mask Pruning for Click-Through Rate Prediction

Authors

Tesi Xiao, Xia Xiao, Ming Chen, Youlong Chen

Abstract

Feature embedding, which maps high-dimensional sparse features to dense embedding vectors, is one of the most essential steps when training deep learning based Click-Through Rate prediction models. Classic human-crafted embedding size selection methods are shown to be "sub-optimal" in terms of the trade-off between memory usage and model capacity. Trending methods in Neural Architecture Search (NAS) have demonstrated their efficiency in searching for embedding sizes. However, most existing NAS-based works suffer from expensive computational costs, the curse of dimensionality of the search space, and the discrepancy between continuous search space and discrete candidate space. Other works that prune embeddings in an unstructured manner fail to reduce the computational costs explicitly. In this paper, to address those limitations, we propose a novel strategy that searches for the optimal mixed-dimension embedding scheme by structurally pruning a super-net via Hard Auxiliary Mask. Our method aims to directly search candidate models in the discrete space using a simple and efficient gradient-based method. Furthermore, we introduce orthogonal regularity on embedding tables to reduce correlations within embedding columns and enhance representation capacity. Extensive experiments demonstrate it can effectively remove redundant embedding dimensions without significant performance loss.
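The two core ideas in the abstract can be illustrated with a minimal NumPy sketch: a hard (binary) auxiliary mask that structurally zeroes whole embedding columns, and an orthogonal-regularity penalty on the surviving sub-table. The score vector `alpha`, the zero threshold, and all sizes below are illustrative assumptions, not the paper's exact formulation (in the actual method, the mask scores are learned jointly with the model by gradient descent).

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size, max_dim = 8, 6                     # one field's embedding table: 8 x 6
embedding = rng.normal(size=(vocab_size, max_dim))

# Soft auxiliary scores (learned in practice); hard mask by thresholding at 0.
# This alpha is a made-up example, not learned values.
alpha = np.array([2.0, 1.5, 0.2, -0.3, 1.0, -1.2])
hard_mask = (alpha > 0).astype(float)          # binary: keep (1) or prune (0) each column

# Structural pruning: a masked column is zeroed for EVERY feature value,
# so the field's effective embedding size shrinks from max_dim to mask.sum().
pruned = embedding * hard_mask
effective_dim = int(hard_mask.sum())

# Orthogonal regularity on the kept sub-table: penalize ||E^T E - I||_F^2
# to reduce correlations between the remaining embedding columns.
kept = embedding[:, hard_mask.astype(bool)]
gram = kept.T @ kept
ortho_penalty = np.sum((gram - np.eye(effective_dim)) ** 2)

print(effective_dim)                           # number of surviving columns
```

Because the mask removes entire columns rather than individual weights, the pruned table can be stored and multiplied at its reduced width, which is what makes the cost reduction explicit compared with unstructured pruning.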
