Paper Title
Automatic Scatterplot Design Optimization for Clustering Identification
Authors
Abstract
Scatterplots are among the most widely used visualization techniques. Compelling scatterplot visualizations improve understanding of data by leveraging visual perception to boost awareness when performing specific visual analytic tasks. Design choices in scatterplots, such as graphical encodings or data aspects, can directly impact decision-making quality for low-level tasks like clustering. Hence, constructing frameworks that consider both the perceptions of the visual encodings and the task being performed enables optimizing visualizations to maximize efficacy. In this paper, we propose an automatic tool to optimize the design factors of scatterplots to reveal the most salient cluster structure. Our approach leverages the merge tree data structure to identify the clusters and optimize the choice of subsampling algorithm, sampling rate, marker size, and marker opacity used to generate a scatterplot image. We validate our approach with user and case studies that show it efficiently provides high-quality scatterplot designs from a large parameter space.
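To make the merge-tree idea in the abstract concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the scatterplot has already been rasterized into a 2D density array, and it builds a superlevel-set merge tree with union-find, reporting how persistent each density peak is. The function names `peak_persistences` and `count_salient_clusters`, the 4-neighbor connectivity, and the persistence threshold are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def peak_persistences(density):
    """Persistence of every density peak, via a superlevel-set merge tree.

    Hypothetical helper: pixels are visited from highest to lowest value and
    joined with union-find; when two components meet, the one with the lower
    peak dies, and its persistence is (peak value - merge value).
    """
    density = np.asarray(density, dtype=float)
    h, w = density.shape
    flat = density.ravel()
    order = np.argsort(-flat, kind="stable")   # highest values first
    parent = np.full(flat.size, -1)            # -1 marks "not visited yet"
    peak = {}                                  # component root -> peak value
    persistences = []

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return int(i)

    for idx in order:
        idx = int(idx)
        parent[idx] = idx
        peak[idx] = flat[idx]
        r, c = divmod(idx, w)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):   # 4-neighborhood (assumed)
            rr, cc = r + dr, c + dc
            if not (0 <= rr < h and 0 <= cc < w):
                continue
            n = rr * w + cc
            if parent[n] == -1:
                continue
            a, b = find(idx), find(n)
            if a == b:
                continue
            if peak[a] < peak[b]:               # elder rule: higher peak survives
                a, b = b, a
            p = float(peak[b] - flat[idx])
            if p > 0:                           # skip trivial zero-persistence merges
                persistences.append(p)
            parent[b] = a

    # The global maximum never merges into anything; give it the full range.
    persistences.append(float(flat.max() - flat.min()))
    return sorted(persistences, reverse=True)


def count_salient_clusters(density, min_persistence):
    """Count peaks whose persistence reaches the (assumed) saliency threshold."""
    return sum(p >= min_persistence for p in peak_persistences(density))


# Toy density with a strong peak (5) and a weaker one (3) separated by a saddle (1).
toy = np.array([[0, 0, 0, 0, 0],
                [0, 5, 1, 3, 0],
                [0, 0, 0, 0, 0]])
print(peak_persistences(toy))            # [5.0, 2.0]
print(count_salient_clusters(toy, 1.5))  # 2
```

Under these assumptions, a design search could rasterize each candidate combination of sampling rate, marker size, and opacity, then compare the resulting persistence values to favor renderings in which the cluster peaks stand out most clearly.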