Class maps for visualizing classification results

Paper title

Class maps for visualizing classification results

Authors

Jakob Raymaekers, Peter J. Rousseeuw, Mia Hubert

Abstract

Classification is a major tool of statistics and machine learning. A classification method first processes a training set of objects with given classes (labels), with the goal of afterward assigning new objects to one of these classes. When running the resulting prediction method on the training data or on test data, it can happen that an object is predicted to lie in a class that differs from its given label. This is sometimes called label bias, and raises the question of whether the object was mislabeled. The proposed class map reflects the probability that an object belongs to an alternative class, how far it is from the other objects in its given class, and whether some objects lie far from all classes. The goal is to visualize aspects of the classification results to obtain insight into the data. The display is constructed for discriminant analysis, the k-nearest neighbor classifier, support vector machines, logistic regression, and coupling pairwise classifications. It is illustrated on several benchmark datasets, including some about images and texts.
