Paper Title

Accelerating Natural Language Understanding in Task-Oriented Dialog

Authors

Ojas Ahuja, Shrey Desai

Abstract

Task-oriented dialog models typically leverage complex neural architectures and large-scale, pre-trained Transformers to achieve state-of-the-art performance on popular natural language understanding benchmarks. However, these models frequently have in excess of tens of millions of parameters, making them impossible to deploy on-device where resource-efficiency is a major concern. In this work, we show that a simple convolutional model compressed with structured pruning achieves largely comparable results to BERT on ATIS and Snips, with under 100K parameters. Moreover, we perform acceleration experiments on CPUs, where we observe our multi-task model predicts intents and slots nearly 63x faster than even DistilBERT.
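As context for the compression technique the abstract names, below is a minimal illustrative sketch of structured pruning on a convolutional layer; this is an assumption-laden toy (plain Python, nested-list weights, L1-norm filter ranking), not the authors' implementation. The key idea it shows: structured pruning removes whole filters rather than zeroing individual weights, so the layer genuinely shrinks and inference gets faster on CPU.

```python
def filter_l1_norm(filt):
    """L1 norm of one conv filter (nested list: channels x kernel width)."""
    return sum(abs(w) for row in filt for w in row)

def prune_filters(filters, keep_ratio):
    """Structured-pruning sketch: keep the `keep_ratio` fraction of whole
    filters with the largest L1 norms and drop the rest. Removing entire
    filters (not scattered weights) shrinks the layer's output dimension,
    which is what yields real speedups without sparse-kernel support."""
    num_keep = max(1, round(len(filters) * keep_ratio))
    ranked = sorted(range(len(filters)),
                    key=lambda i: filter_l1_norm(filters[i]),
                    reverse=True)
    keep = sorted(ranked[:num_keep])  # preserve original filter order
    return [filters[i] for i in keep], keep

# Toy layer: 4 filters, each 2 input channels x kernel width 3.
layer = [
    [[0.10, -0.20, 0.10], [0.00, 0.10, -0.10]],   # weak filter
    [[1.50, -2.00, 0.80], [0.90, -1.10, 0.40]],   # strong filter
    [[0.05, 0.00, 0.02], [0.01, -0.03, 0.00]],    # weakest filter
    [[0.70, 0.60, -0.90], [1.20, -0.50, 0.30]],   # strong filter
]
pruned, kept = prune_filters(layer, keep_ratio=0.5)
print(kept)  # [1, 3]: the two strongest filters survive
```

In practice the ranking criterion and pruning schedule vary; the paper's sub-100K-parameter result comes from applying this kind of filter-level compression to a simple convolutional intent/slot model rather than distilling a Transformer.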
