Efficient, Uncertainty-based Moderation of Neural Networks Text Classifiers

Paper title

Efficient, Uncertainty-based Moderation of Neural Networks Text Classifiers

Authors

Jakob Smedegaard Andersen, Walid Maalej

Abstract

To maximize the accuracy and increase the overall acceptance of text classifiers, we propose a framework for the efficient, in-operation moderation of classifiers' output. Our framework focuses on use cases in which the F1-scores of modern neural network classifiers (ca. 90%) are still insufficient in practice. We suggest a semi-automated approach that uses prediction uncertainties to pass unconfident, probably incorrect classifications to human moderators. To minimize the workload, we limit the human-moderated data to the point where the accuracy gains saturate and further human effort does not lead to substantial improvements. A series of benchmarking experiments based on three different datasets and three state-of-the-art classifiers shows that our framework can improve the classification F1-scores by 5.1 to 11.2% (up to approx. 98 to 99%), while reducing the moderation load by up to 73.3% compared to random moderation.
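
The routing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `moderate_by_uncertainty`, the `budget` parameter, and the least-confidence score (one minus the highest predicted class probability) are assumptions made for illustration, and the human moderator is simulated by an oracle that always returns the true label.

```python
import numpy as np


def moderate_by_uncertainty(probs, y_true, budget):
    """Send the `budget` least-confident predictions to a simulated moderator.

    probs  : (n_samples, n_classes) predicted class probabilities
    y_true : (n_samples,) true labels, used here as a stand-in for the
             human moderator (assumption: the moderator is always right)
    budget : number of predictions a human is allowed to review
    """
    probs = np.asarray(probs, dtype=float)
    y_true = np.asarray(y_true)

    # Automatic decision of the classifier.
    y_pred = probs.argmax(axis=1)

    # Least-confidence uncertainty (illustrative choice, not from the paper).
    uncertainty = 1.0 - probs.max(axis=1)

    # Most uncertain predictions first; stable sort keeps ties in input order.
    order = np.argsort(-uncertainty, kind="stable")
    moderated = order[:budget]

    # The moderator overrides the classifier on the selected items only.
    final = y_pred.copy()
    final[moderated] = y_true[moderated]
    return final, moderated


if __name__ == "__main__":
    probs = [[0.90, 0.10], [0.55, 0.45], [0.20, 0.80], [0.51, 0.49]]
    y_true = [0, 1, 1, 1]
    final, moderated = moderate_by_uncertainty(probs, y_true, budget=2)
    print("moderated indices:", moderated.tolist())
    print("final labels:", final.tolist())
```

Increasing `budget` step by step and tracking the resulting F1-score on held-out data is one way to look for the saturation point the abstract refers to, beyond which additional moderation yields little improvement.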
