Paper Title

DynamicEmbedding: Extending TensorFlow for Colossal-Scale Applications

Paper Authors

Yun Zeng, Siqi Zuo, Dongcai Shen

Abstract

One of the limitations of deep learning models with sparse features today stems from the predefined nature of their input, which requires a dictionary to be defined prior to training. In this paper we propose both a theory and a working system design that remove this limitation, and show that the resulting models are able to perform better and to run efficiently at a much larger scale. Specifically, we achieve this by decoupling a model's content from its form to tackle architecture evolution and memory growth separately. To efficiently handle model growth, we propose a new neuron model, called DynamicCell, drawing inspiration from the free energy principle [15] to introduce the concept of reaction to discharge non-digestive energy, which also subsumes gradient-descent-based approaches as special cases. We implement DynamicCell by introducing a new server into TensorFlow that takes over most of the work involving model growth. Consequently, it enables any existing deep learning model to efficiently handle an arbitrary number of distinct sparse features (e.g., search queries) and to grow incessantly without redefining the model. Most notably, one of our models, which has been reliably running in production for over a year, is capable of suggesting high-quality keywords for advertisers of Google Smart Campaigns and has achieved significant accuracy gains on a challenging metric -- evidence that data-driven, self-evolving systems can potentially exceed the performance of traditional rule-based approaches.
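To make the core idea concrete, below is a minimal, hypothetical sketch in Python/NumPy of an embedding store whose vocabulary is not fixed in advance: vectors are keyed by raw feature strings (e.g., search queries) and allocated lazily on first lookup, so the table grows as new features arrive. The names DynamicEmbeddingStore, lookup, and apply_gradients are illustrative assumptions, not the paper's API; the paper realizes this behavior through a dedicated server added to TensorFlow, and its DynamicCell update only reduces to the plain gradient step used here as a special case.

```python
import numpy as np

class DynamicEmbeddingStore:
    """Conceptual sketch: embedding vectors keyed by arbitrary string
    features, created lazily on first lookup so no dictionary has to be
    defined before training (not the paper's actual implementation)."""

    def __init__(self, dim, seed=0):
        self.dim = dim
        self.table = {}  # feature string -> embedding vector
        self.rng = np.random.default_rng(seed)

    def lookup(self, features):
        """Return embeddings for a batch of feature strings, growing the
        table whenever an unseen feature arrives."""
        rows = []
        for f in features:
            if f not in self.table:
                # New sparse feature: allocate a fresh vector instead of
                # failing an out-of-vocabulary lookup.
                self.table[f] = self.rng.normal(0.0, 0.1, self.dim)
            rows.append(self.table[f])
        return np.stack(rows)

    def apply_gradients(self, features, grads, lr=0.01):
        """Plain SGD update on the looked-up rows; this keeps only the
        gradient-descent special case of the DynamicCell update."""
        for f, g in zip(features, grads):
            self.table[f] -= lr * g


# Usage: the vocabulary grows as new queries appear.
store = DynamicEmbeddingStore(dim=8)
emb = store.lookup(["cheap flights", "running shoes", "cheap flights"])
print(emb.shape, len(store.table))  # (3, 8) 2
```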
