Paper Title

Assessing the Impact of Sequence Length Learning on Classification Tasks for Transformer Encoder Models

Authors

Jean-Thomas Baillargeon, Luc Lamontagne

Abstract

Classification algorithms using Transformer architectures can be affected by the sequence length learning problem whenever observations from different classes have a different length distribution. This problem causes models to use sequence length as a predictive feature instead of relying on important textual information. Although most public datasets are not affected by this problem, privately owned corpora for fields such as medicine and insurance may carry this data bias. The exploitation of this sequence length feature poses challenges throughout the value chain as these machine learning models can be used in critical applications. In this paper, we empirically expose this problem and present approaches to minimize its impacts.
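
As a rough illustration of the bias described in the abstract, the sketch below checks how well class labels can be predicted from sequence length alone. It is not the authors' method. The function name `length_only_accuracy`, the whitespace tokenization, and the single-threshold rule are all assumptions made for this example. If the best length-only accuracy is far above the majority-class baseline, a classifier trained on the corpus may be able to exploit length instead of content.

```python
from collections import Counter


def length_only_accuracy(texts, labels):
    """Return (majority_baseline, best_length_rule_accuracy).

    Hypothetical diagnostic: tries every rule of the form
    "predict class A if length <= threshold, else class B"
    and reports the best accuracy such a rule can reach.
    """
    # Whitespace token counts stand in for model sequence lengths (an assumption).
    lengths = [len(text.split()) for text in texts]
    total = len(labels)

    # Accuracy of always predicting the most frequent class.
    majority = Counter(labels).most_common(1)[0][1] / total

    classes = sorted(set(labels))
    best = majority
    for threshold in sorted(set(lengths)):
        for short_cls in classes:
            for long_cls in classes:
                if short_cls == long_cls:
                    continue
                correct = sum(
                    (short_cls if n <= threshold else long_cls) == y
                    for n, y in zip(lengths, labels)
                )
                best = max(best, correct / total)
    return majority, best


if __name__ == "__main__":
    # Toy corpus (made up): class 1 texts are systematically longer.
    texts = ["a b", "c d", "a b c d e f", "g h i j k l"]
    labels = [0, 0, 1, 1]
    print(length_only_accuracy(texts, labels))
```

A gap between the two returned numbers only signals that length is informative in the data; whether a given Transformer encoder actually relies on it still has to be tested empirically, as the paper sets out to do.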
