Paper Title
On the Subjectivity of Emotions in Software Projects: How Reliable are Pre-Labeled Data Sets for Sentiment Analysis?
Paper Authors
Paper Abstract
Social aspects of software projects are becoming increasingly important for research and practice. Different approaches analyze the sentiment of a development team, ranging from simply asking the team to so-called sentiment analysis of text-based communication. These sentiment analysis tools are trained on pre-labeled data sets from different sources, including GitHub and Stack Overflow. In this paper, we investigate whether the labels of the statements in the data sets coincide with the perception of potential members of a software project team. Based on an international survey, we compare the median perception of 94 participants with the pre-labeled data sets, as well as every single participant's agreement with the predefined labels. Our results point to three remarkable findings: (1) although the median values coincide with the predefined labels of the data sets in 62.5% of the cases, we observe a large difference between individual participants' ratings and the labels; (2) not a single participant fully agrees with the predefined labels; and (3) the data set whose labels are based on guidelines performs better than the ad hoc labeled data set.