Paper Title

KL Regularized Normalization Framework for Low Resource Tasks

Paper Authors

Neeraj Kumar, Ankur Narang, Brejesh Lall

Paper Abstract

Large pre-trained models, such as BERT, GPT, and Wav2Vec, have demonstrated great potential for learning representations that are transferable to a wide variety of downstream tasks. It is difficult to obtain a large quantity of supervised data due to the limited availability of resources and time. In light of this, a significant amount of research has been conducted on adapting large pre-trained models to diverse downstream tasks via fine-tuning, linear probing, or prompt tuning in low-resource settings. Normalization techniques are essential for accelerating training and improving the generalization of deep neural networks, and have been used successfully in a wide variety of applications. Many normalization techniques have been proposed, but their success in low-resource downstream NLP and speech tasks is limited. One reason is the inability of the rescaling parameters of normalization to capture expressiveness. We propose Kullback-Leibler (KL) Regularized Normalization (KL-Norm), which makes the normalized data well behaved and helps generalization: it reduces overfitting, generalizes well to out-of-domain distributions, and removes irrelevant biases and features, with a negligible increase in model parameters and memory overhead. Detailed experimental evaluation on multiple low-resource NLP and speech tasks demonstrates the superior performance of KL-Norm compared to other popular normalization and regularization techniques.
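
The abstract does not spell out the layer's mechanics, so the following is a minimal, hypothetical PyTorch sketch of one way a KL-regularized normalization layer could be realized: activations are layer-normalized and rescaled as usual, a Gaussian posterior is placed over the normalized features, and the KL divergence to a standard normal prior is returned so the caller can add it to the task loss. The class and attribute names (KLRegularizedNorm, to_mu, to_logvar) and the KL weight are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class KLRegularizedNorm(nn.Module):
    """Hypothetical sketch of a KL-regularized normalization layer.

    Not the paper's reference implementation: features are normalized as in
    layer norm, a Gaussian posterior q(z|x) = N(mu(x), diag(sigma(x)^2)) is
    placed over the normalized features, and KL(q || N(0, I)) is returned as
    a regularizer to be added to the task loss.
    """

    def __init__(self, hidden_dim: int, eps: float = 1e-5):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(hidden_dim))   # rescaling weight
        self.beta = nn.Parameter(torch.zeros(hidden_dim))   # rescaling bias
        self.to_mu = nn.Linear(hidden_dim, hidden_dim)      # posterior mean
        self.to_logvar = nn.Linear(hidden_dim, hidden_dim)  # posterior log-variance
        self.eps = eps

    def forward(self, x: torch.Tensor):
        # Standard feature-wise normalization (as in layer norm).
        mean = x.mean(dim=-1, keepdim=True)
        var = x.var(dim=-1, unbiased=False, keepdim=True)
        x_hat = (x - mean) / torch.sqrt(var + self.eps)

        # Gaussian posterior over the normalized features.
        mu = self.to_mu(x_hat)
        logvar = self.to_logvar(x_hat)
        if self.training:
            # Reparameterization trick: sample z ~ q(z|x) during training.
            z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        else:
            z = mu

        # KL(q(z|x) || N(0, I)), summed over features, averaged over the batch.
        kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1.0).sum(dim=-1).mean()

        return self.gamma * z + self.beta, kl


# Usage sketch: the KL term acts as a regularizer on top of the task loss.
norm = KLRegularizedNorm(hidden_dim=768)
hidden, kl = norm(torch.randn(8, 128, 768))
loss = hidden.pow(2).mean() + 0.1 * kl  # 0.1 is an illustrative KL weight
```

Under these assumptions, sampling during training and using the posterior mean at inference keeps the overhead to two linear projections per normalized layer, which would be consistent with the abstract's claim of a negligible increase in model parameters and memory.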
