Title

Privacy-Aware Crowd Labelling for Machine Learning Tasks

Authors

Giannis Haralabopoulos, Ioannis Anagnostopoulos

Abstract

The extensive use of online social media has highlighted the importance of privacy in the digital space. As more scientists analyse the data created on these platforms, privacy concerns have extended to data usage within academia. Although text analysis is a well-documented topic in the academic literature, with a multitude of applications, ensuring the privacy of user-generated content has been overlooked. Most sentiment analysis methods require emotion labels, which can be obtained through crowdsourcing, where non-expert individuals contribute to scientific tasks. The text itself has to be exposed to third parties in order to be labelled. In an effort to reduce the exposure of online users' information, we propose a privacy-preserving text labelling method for varying applications, based on crowdsourcing. We transform text with different levels of privacy and analyse the effectiveness of the transformation with regard to label correlation and consistency. Our results suggest that privacy can be implemented in labelling while retaining the annotational diversity and subjectivity of traditional labelling.
