Paper title
Short-answer scoring with ensembles of pretrained language models
Paper authors
Paper abstract
We investigate the effectiveness of ensembles of pretrained transformer-based language models on short-answer questions using the Kaggle Automated Short Answer Scoring dataset. We fine-tune a collection of popular small, base, and large pretrained transformer-based language models, and train one feature-based model on the dataset, with the aim of testing ensembles of these models. We use an early stopping mechanism and hyperparameter optimization during training. We observe that, in general, the larger models perform slightly better; however, they still fall short of state-of-the-art results on their own. Once we consider ensembles of models, some ensembles of several large networks do produce state-of-the-art results; however, these ensembles are too large to realistically be deployed in a production environment.