Paper Title

An Improved Approach for Estimating Social POI Boundaries With Textual Attributes on Social Media

Authors

Cong Tran, Dung D. Vu, Won-Yong Shin

Abstract

How to perform density-based clustering by exploiting textual attributes on social media has been insufficiently explored. In this paper, we aim to discover a social point-of-interest (POI) boundary, formed as a convex polygon. More specifically, we present a new approach and algorithm built upon our earlier work on social POI boundary estimation (SoBEst). The SoBEst approach takes into account both relevant and irrelevant records within a geographic area, where relevant records contain a POI name or its variations in their text field. Our study is motivated by the following empirical observation: the fixed representative coordinate of each POI that SoBEst assumes may be far from the centroid of the estimated social POI boundary for certain POIs. Using SoBEst in such cases may therefore result in unsatisfactory boundary estimation quality (BEQ), which is expressed as a function of the $F$-measure. To solve this problem, we formulate a joint optimization problem of simultaneously finding the radius of a circle and the POI's representative coordinate $c$ by allowing $c$ to be updated. Subsequently, we design an iterative SoBEst (I-SoBEst) algorithm, which enables us to achieve a higher BEQ for some POIs. The computational complexity of the proposed I-SoBEst algorithm is shown to scale linearly with the number of records. We demonstrate the superiority of our algorithm over competing clustering methods, including the original SoBEst.
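
To make the idea of jointly choosing a radius and a representative coordinate $c$ more concrete, the following is a minimal illustrative sketch, not the I-SoBEst algorithm itself. It assumes planar coordinates, a boolean relevance label per record, an F1 score as the quality function, and a simple alternating scheme (pick the best radius for the current center, then move the center to the centroid of the relevant records inside the circle). All function names (`f_measure`, `best_radius`, `refine_center_and_radius`) are hypothetical, and the exhaustive radius search here is quadratic, so it does not reflect the linear complexity stated in the abstract.

```python
import math

def f_measure(points, labels, center, radius):
    """F1 score of 'record lies inside the circle' as a predictor of 'record is relevant'."""
    inside = [math.dist(p, center) <= radius for p in points]
    tp = sum(1 for i, r in zip(inside, labels) if i and r)
    if tp == 0:
        return 0.0
    precision = tp / sum(inside)
    recall = tp / sum(labels)
    return 2 * precision * recall / (precision + recall)

def best_radius(points, labels, center):
    """Try each distance from the center to a relevant record as a candidate radius."""
    candidates = sorted({math.dist(p, center) for p, r in zip(points, labels) if r})
    if not candidates:
        raise ValueError("at least one relevant record is required")
    return max(candidates, key=lambda rad: f_measure(points, labels, center, rad))

def refine_center_and_radius(points, labels, center, max_iter=20, tol=1e-9):
    """Alternate between radius selection and center update; return the best iterate seen.

    The centroid update is a heuristic assumed for this sketch and is not
    guaranteed to improve the score, so the best (center, radius, score) is kept.
    """
    best = None
    for _ in range(max_iter):
        radius = best_radius(points, labels, center)
        score = f_measure(points, labels, center, radius)
        if best is None or score > best[2]:
            best = (center, radius, score)
        # Move the center to the centroid of relevant records inside the circle.
        inside_rel = [p for p, r in zip(points, labels)
                      if r and math.dist(p, center) <= radius]
        new_center = (sum(x for x, _ in inside_rel) / len(inside_rel),
                      sum(y for _, y in inside_rel) / len(inside_rel))
        if math.dist(new_center, center) < tol:
            break
        center = new_center
    return best
```

Starting from a poorly placed center, a call such as `refine_center_and_radius(points, labels, (0.0, 0.0))` returns a tuple of the refined center, its radius, and the corresponding F1 score. A convex polygon could then be derived from the relevant records inside the final circle, a step this sketch leaves out.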
