Paper Title

Attention Mechanism with Energy-Friendly Operations

Paper Authors

Yu Wan, Baosong Yang, Dayiheng Liu, Rong Xiao, Derek F. Wong, Haibo Zhang, Boxing Chen, Lidia S. Chao

Paper Abstract

The attention mechanism has become the dominant module in natural language processing models. It is computationally intensive and depends on massive, power-hungry multiplications. In this paper, we rethink variants of the attention mechanism from the perspective of energy consumption. After establishing that several energy-friendly operations cost far less energy than their multiplication counterparts, we build a novel attention model by replacing multiplications with either selective operations or additions. Empirical results on three machine translation tasks demonstrate that, compared with the vanilla model, the proposed model achieves competitive accuracy while saving 99% of the energy during alignment calculation and 66% over the whole attention procedure. Code is available at: https://github.com/NLP2CT/E-Att.
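The abstract only sketches the core idea of swapping the multiplications in alignment scoring for cheaper operations; the exact formulation is given in the paper and the linked repository. As a rough illustration under that assumption, the minimal NumPy sketch below contrasts standard scaled dot-product alignment with a hypothetical multiplication-free score based on negative L1 distance (subtractions, absolute values, and additions only). The function names and the L1-distance choice are assumptions for illustration, not the paper's actual E-Att operations.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def dot_product_scores(Q, K):
    # Vanilla alignment: one multiply-accumulate per (query, key, dim) triple.
    d = Q.shape[-1]
    return Q @ K.T / np.sqrt(d)

def additive_l1_scores(Q, K):
    # Hypothetical multiplication-free alternative: negative L1 distance,
    # computed with only subtractions, absolute values, and additions;
    # closer query/key vectors receive larger scores.
    return -np.abs(Q[:, None, :] - K[None, :, :]).sum(-1)

# Toy example: 4 query positions, 6 key positions, 8-dim features.
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(6, 8))
V = rng.normal(size=(6, 8))

attn_vanilla = softmax(dot_product_scores(Q, K)) @ V
attn_energy_friendly = softmax(additive_l1_scores(Q, K)) @ V
print(attn_vanilla.shape, attn_energy_friendly.shape)  # (4, 8) (4, 8)
```

Note that in this sketch only the alignment scores avoid multiplications; the weighted sum over V still uses them, which is consistent with the abstract reporting a larger saving for alignment calculation (99%) than for the whole attention procedure (66%).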
