Paper Title
Author's Sentiment Prediction
Paper Authors
Paper Abstract
We introduce PerSenT, a dataset of crowd-sourced annotations of the sentiment expressed by the authors towards the main entities in news articles. The dataset also includes paragraph-level sentiment annotations to provide more fine-grained supervision for the task. Our benchmarks of multiple strong baselines show that this is a difficult classification task. The results also suggest that simply fine-tuning document-level representations from BERT is not adequate for this task. Making paragraph-level decisions and aggregating them over the entire document is also ineffective. We present empirical and qualitative analyses that illustrate the specific challenges posed by this dataset. We release this dataset with 5.3k documents and 38k paragraphs covering 3.2k unique entities as a challenge in entity sentiment analysis.