Best-Answer Prediction in Q&A Sites Using User Information

Paper title

Best-Answer Prediction in Q&A Sites Using User Information

Authors

Hadfi, Rafik; Moustafa, Ahmed; Yoshino, Kai; Ito, Takayuki

Abstract

Community Question Answering (CQA) sites have spread and multiplied significantly in recent years. Sites like Reddit, Quora, and Stack Exchange are becoming popular among people looking for answers to diverse questions. One practical way of finding such answers is to automatically predict the best candidate given the existing answers and comments. Many studies have addressed answer prediction in CQA, but with limited focus on the background information of the questioners. We address this limitation with a novel method that predicts the best answer using the questioner's background information together with other features, such as the textual content or the relationships with other participants. Our answer classification model was trained on the Stack Exchange dataset and validated using the Area Under the Curve (AUC) metric. The experimental results show that the proposed method complements previous methods by pointing out the importance of the relationships between users, particularly through their level of involvement in different communities on Stack Exchange. Furthermore, we point out that there is little overlap between user-relation information and the information represented by shallow text features and meta-features, such as time differences.
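
For readers unfamiliar with the AUC metric mentioned in the abstract, the following is a minimal sketch of how it can be computed for a binary best-answer classifier. It is not the authors' evaluation code: the function name `auc`, the labels, and the scores are all hypothetical, and the pairwise (Mann-Whitney) formulation is just one standard way to obtain the area under the ROC curve.

```python
def auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive example is
    scored higher; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = answer accepted as best, 0 = any other answer.
labels = [1, 0, 0, 1, 0]
# Hypothetical classifier scores for the same five answers.
scores = [0.9, 0.4, 0.6, 0.5, 0.2]

result = auc(labels, scores)
print(f"AUC = {result:.3f}")
```

An AUC of 1.0 means every best answer is ranked above every other answer, while 0.5 corresponds to random ranking. The pairwise loop is quadratic in the number of answers, so a sort-based implementation would be preferable on a large dataset.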

Paper title