Paper Title
Investigating Ensemble Methods for Model Robustness Improvement of Text Classifiers
Paper Authors
Paper Abstract
Large pre-trained language models have shown remarkable performance over the past few years. These models, however, sometimes learn superficial features from the dataset and fail to generalize to distributions dissimilar to the training scenario. Several approaches have been proposed to reduce a model's reliance on these bias features, which can improve model robustness in the out-of-distribution setting. However, existing methods usually use a fixed low-capacity model to handle various bias features, which ignores the learnability of those features. In this paper, we analyze a set of existing bias features and demonstrate that no single model works best for all cases. We further show that by choosing an appropriate bias model, we can obtain better robustness results than baselines with more sophisticated model designs.
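The abstract does not spell out the ensemble formulation, but a common choice in this line of debiasing work is product-of-experts (PoE), where the main classifier is trained against the combined predictions of itself and a separately trained bias model. Below is a minimal PyTorch sketch of that idea; the names `poe_debiasing_loss`, `main_logits`, and `bias_logits` are illustrative assumptions, not identifiers from the paper.

```python
import torch
import torch.nn.functional as F

def poe_debiasing_loss(main_logits, bias_logits, labels):
    """Product-of-experts (PoE) style debiasing loss.

    The main model's log-probabilities are combined with those of a
    fixed bias model; cross-entropy on the combined distribution
    down-weights examples the bias model already gets right, pushing
    the main model away from the bias features.
    """
    # Combine the two experts in log space: log p_main + log p_bias.
    # detach() keeps gradients from flowing into the bias model.
    combined = F.log_softmax(main_logits, dim=-1) + \
               F.log_softmax(bias_logits.detach(), dim=-1)
    # Renormalize the product and compute negative log-likelihood.
    return F.nll_loss(F.log_softmax(combined, dim=-1), labels)

# Hypothetical usage: `main_logits` would come from the full-capacity
# classifier being trained, `bias_logits` from the chosen bias model
# (whose capacity, per the abstract, should match the bias feature).
main_logits = torch.randn(8, 3)   # stand-in for main_model(batch)
bias_logits = torch.randn(8, 3)   # stand-in for bias_model(batch)
labels = torch.randint(0, 3, (8,))
loss = poe_debiasing_loss(main_logits, bias_logits, labels)
```

Under this formulation, the abstract's point is that the capacity of the bias model is itself a design choice: a model too weak or too strong for a given bias feature yields a poorer debiasing signal than one matched to how learnable that feature is.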