Paper Title
Multiple Metric Learning for Structured Data
Paper Authors
Paper Abstract
We address the problem of merging graph and feature-space information while learning a metric from structured data. Existing algorithms tackle the problem in an asymmetric way, by either extracting vectorized summaries of the graph structure or adding hard constraints to feature-space algorithms. Following a different path, we define a metric regression scheme where we train metric-constrained linear combinations of dissimilarity matrices. The idea is that the input matrices can be pre-computed dissimilarity measures obtained from any kind of available data (e.g. node attributes or edge structure). As the model inputs are distance measures, we do not need to assume the existence of any underlying feature space. The main challenge is that metric constraints (especially positive-definiteness and sub-additivity) are not automatically respected if, for example, the coefficients of the linear combination are allowed to be negative. Both the positivity and the sub-additivity constraints are linear inequalities, but the computational complexity of imposing them scales as O(D^3), where D is the size of the input matrices (i.e. the size of the data set). This quickly becomes prohibitive, even when D is relatively small. We propose a new graph-based technique for optimizing under such constraints and show that, in some cases, our approach may reduce the original computational complexity of the optimization process by one order of magnitude. Contrary to existing methods, our scheme applies to any (possibly non-convex) metric-constrained objective function.
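To make the constraint structure described in the abstract concrete, the following sketch (illustrative only, not the authors' implementation; all names such as combine_dissimilarities and is_metric are hypothetical) forms a linear combination of pre-computed dissimilarity matrices and checks the metric axioms by brute force. The triple loop over (i, j, k) makes the O(D^3) cost of imposing sub-additivity explicit.

```python
import numpy as np

def combine_dissimilarities(matrices, coeffs):
    """Weighted sum of pre-computed D x D dissimilarity matrices.

    With unconstrained (possibly negative) coefficients, the result is
    not guaranteed to be a metric -- hence the checks below.
    """
    return sum(c * M for c, M in zip(coeffs, matrices))

def is_metric(D, tol=1e-9):
    """Brute-force check of the metric axioms on a D x D matrix.

    The sub-additivity (triangle inequality) check enumerates all
    (i, j, k) triples: this is the O(D^3) cost cited in the abstract.
    """
    n = D.shape[0]
    if np.any(D < -tol):                   # non-negativity
        return False
    if not np.allclose(D, D.T, atol=tol):  # symmetry
        return False
    if np.any(np.abs(np.diag(D)) > tol):   # d(x, x) = 0
        return False
    for i in range(n):                     # sub-additivity: O(D^3) triples
        for j in range(n):
            for k in range(n):
                if D[i, j] > D[i, k] + D[k, j] + tol:
                    return False
    return True

# Toy usage: two dissimilarity matrices on 5 random points.
rng = np.random.default_rng(0)
X = rng.random((5, 2))
M1 = np.linalg.norm(X[:, None] - X[None, :], axis=-1)  # Euclidean distance
M2 = np.abs(X[:, None, 0] - X[None, :, 0])             # first-coordinate distance

D_ok = combine_dissimilarities([M1, M2], coeffs=[0.7, 0.3])
print(is_metric(D_ok))    # True: a non-negative combination of metrics is a metric

D_bad = combine_dissimilarities([M1, M2], coeffs=[1.0, -0.9])
print(is_metric(D_bad))   # not guaranteed: a negative coefficient can break metricity
```

The paper's contribution is precisely about avoiding this exhaustive enumeration: its graph-based optimization technique is reported to reduce the cost of handling these constraints, in some cases, by an order of magnitude.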