Toward More Effective Human Evaluation for Machine Translation

Paper Title

Toward More Effective Human Evaluation for Machine Translation

Authors

Belén Saldías, George Foster, Markus Freitag, Qijun Tan

Abstract

Improvements in text generation technologies such as machine translation have necessitated more costly and time-consuming human evaluation procedures to ensure an accurate signal. We investigate a simple way to reduce cost by reducing the number of text segments that must be annotated in order to accurately predict a score for a complete test set. Using a sampling approach, we demonstrate that information from document membership and automatic metrics can help improve estimates compared to a pure random sampling baseline. We achieve gains of up to 20% in average absolute error by leveraging stratified sampling and control variates. Our techniques can improve estimates made from a fixed annotation budget, are easy to implement, and can be applied to any problem with structure similar to the one we study.
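
The two ideas named in the abstract, stratified sampling by document and control variates built from an automatic metric, can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the synthetic data, the proportional allocation rule, and the function names (`stratified_estimate`, `control_variate_estimate`, `compare`) are all hypothetical and chosen only for illustration.

```python
import random
from statistics import mean


def control_variate_estimate(human, metric, sample_idx):
    """Sample mean of human scores, corrected with an automatic metric.

    The metric is assumed to be known for every segment, so its full-set
    mean costs no annotation. The coefficient beta is fitted on the sample.
    """
    h = [human[i] for i in sample_idx]
    m = [metric[i] for i in sample_idx]
    h_bar, m_bar = mean(h), mean(m)
    var_m = sum((x - m_bar) ** 2 for x in m)
    cov_hm = sum((x - m_bar) * (y - h_bar) for x, y in zip(m, h))
    beta = cov_hm / var_m if var_m > 0 else 0.0
    return h_bar - beta * (m_bar - mean(metric))


def stratified_estimate(human, doc_ids, budget, rng):
    """Sample within each document and weight by document size.

    Allocation is proportional with at least one segment per document, so
    the number of annotated segments can differ slightly from `budget`.
    """
    by_doc = {}
    for i, d in enumerate(doc_ids):
        by_doc.setdefault(d, []).append(i)
    n = len(doc_ids)
    total = 0.0
    for members in by_doc.values():
        k = min(len(members), max(1, round(budget * len(members) / n)))
        picked = rng.sample(members, k)
        total += len(members) / n * mean(human[i] for i in picked)
    return total


def compare(trials=300, budget=100, seed=0):
    """Mean absolute error of three estimators on synthetic data."""
    rng = random.Random(seed)
    doc_ids, human, metric = [], [], []
    # Assumption: 50 documents of 20 segments; scores share a document effect.
    for d in range(50):
        effect = rng.gauss(0, 1)
        for _ in range(20):
            doc_ids.append(d)
            human.append(effect + rng.gauss(0, 0.5))
            metric.append(effect + rng.gauss(0, 0.5))
    truth = mean(human)  # score for the complete test set
    errs = {"random": [], "stratified": [], "control_variate": []}
    for _ in range(trials):
        idx = rng.sample(range(len(human)), budget)
        errs["random"].append(abs(mean(human[i] for i in idx) - truth))
        errs["stratified"].append(
            abs(stratified_estimate(human, doc_ids, budget, rng) - truth))
        errs["control_variate"].append(
            abs(control_variate_estimate(human, metric, idx) - truth))
    return {name: mean(e) for name, e in errs.items()}


if __name__ == "__main__":
    for name, mae in compare().items():
        print(f"{name}: {mae:.4f}")
```

In this toy setup the human score and the metric are strongly correlated and segments cluster by document, so both techniques should give a lower error than pure random sampling at the same budget. The size of the gain here reflects the synthetic data, not the results reported in the paper.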
