Paper title
Hawkes Process Classification through Discriminative Modeling of Text
Paper authors
Paper abstract
Social media gives users a platform to gather and share information and to stay up to date with the news. Such networks also let users engage in conversations. However, micro-blogging platforms such as Twitter restrict the length of a post. Because such short posts contain few word occurrences, classifying them with standard natural language processing (NLP) tools is challenging. The high complexity and dynamics of social media posts make text classification harder still. Additional cues, in the form of past labels and the times associated with posts, can potentially improve classification. To exploit these cues, we propose models based on the Hawkes process (HP), which can naturally incorporate temporal features and past labels alongside textual features to improve short text classification. In particular, we propose a discriminative approach to modelling text in an HP, in which the text features parameterize the base intensity and/or the triggering kernel. Another major contribution is to treat the kernel as a function of both time and text, and to model it with a neural network. This makes it possible to model and effectively learn the text together with the historical influences for tweet classification. We demonstrate the advantages of the proposed techniques on standard benchmarks for rumour stance classification.
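The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the paper's model. It assumes a multivariate Hawkes process with one intensity per label, a base intensity of the form exp(w_y · x) computed from a post's text feature vector x, and an exponential triggering kernel through which earlier labelled posts raise the intensity of later labels. The function names, the exponential link, the influence matrix `alpha`, and the decay rate `omega` are all illustrative assumptions.

```python
import math


def base_intensity(x, W):
    """Text-dependent base intensity, one value per label.

    Assumed form: mu_y(x) = exp(w_y . x), with one weight vector w_y per label.
    """
    return [math.exp(sum(w_i * x_i for w_i, x_i in zip(w, x))) for w in W]


def intensity(t, x, history, W, alpha, omega):
    """Conditional intensity of every label at time t for a post with features x.

    history: list of (time, label) pairs for earlier posts in the conversation.
    alpha[y_prev][y]: assumed influence of a past label y_prev on label y.
    omega: assumed decay rate of the exponential triggering kernel.
    """
    lam = base_intensity(x, W)
    for t_prev, y_prev in history:
        if t_prev < t:  # only strictly earlier posts can excite the current one
            decay = omega * math.exp(-omega * (t - t_prev))
            lam = [l + alpha[y_prev][y] * decay for y, l in enumerate(lam)]
    return lam


def predict_label(t, x, history, W, alpha, omega):
    """Pick the label whose intensity is highest at time t."""
    lam = intensity(t, x, history, W, alpha, omega)
    return max(range(len(lam)), key=lam.__getitem__)
```

In this sketch, text alone decides the label when the history is empty, while a recent post with a given label shifts the prediction toward the labels it excites. Replacing the fixed exponential kernel with a learned function of both the time gap and the text features would correspond to the neural kernel that the abstract mentions.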