Paper Title

Neural Models for Output-Space Invariance in Combinatorial Problems

Paper Authors

Yatin Nandwani, Vidit Jain, Mausam, Parag Singla

Paper Abstract

Recently, many neural models have been proposed to solve combinatorial puzzles, such as Sudoku or Graph Coloring (GCP), by implicitly learning the underlying constraints from solved instances. One drawback of the proposed architectures, which are often based on Graph Neural Networks (GNNs), is that they cannot generalize across the size of the output space from which variables are assigned a value, for example, the set of colors in GCP or the board size in Sudoku. We refer to the output space for the variables as the 'value-set'. While many works have demonstrated generalization of GNNs across graph sizes, there has been no study of how to design a GNN that achieves value-set invariance for problems from the same domain, for example, learning to solve 16 x 16 Sudoku after being trained only on 9 x 9 Sudokus. In this work, we propose novel methods to extend GNN-based architectures to achieve value-set invariance. Specifically, our models build on the recently proposed Recurrent Relational Networks. Our first approach exploits the graph-size invariance of GNNs by converting a multi-class node classification problem into a binary node classification problem. Our second approach works directly with multiple classes by adding one node per value in the value-set and then connecting variable nodes to value nodes according to the problem's initialization. Our experimental evaluation on three different combinatorial problems demonstrates that both of our models perform well on this novel task compared to a generic neural reasoner. Between our two models, we observe an inherent trade-off: while the binarized model gives better performance when trained on smaller value-sets, the multi-valued model is much more memory-efficient, resulting in improved performance when trained on larger value-sets, where the binarized model fails to train.
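To make the two encodings concrete, below is a minimal sketch (in plain Python) of how the two graphs might be built for a problem given as pairwise difference constraints, such as graph coloring. The helper names and the exact edge conventions here are illustrative assumptions, not code from the paper; in the actual models, a Recurrent Relational Network then passes messages over these graphs.

```python
from itertools import product


def build_binarized_graph(num_vars, num_values, constraints, givens):
    """Approach 1 (binarized): one binary node per (variable, value) pair.

    `constraints` is a list of variable pairs (i, j) that must take
    different values (e.g., two cells in the same Sudoku row); `givens`
    maps a variable to its pre-assigned value. The GNN then performs
    binary node classification ("is this pair part of the solution?"),
    so the architecture no longer depends on the value-set size.
    """
    node_id = {(v, d): i for i, (v, d) in
               enumerate(product(range(num_vars), range(num_values)))}
    edges = []
    for v in range(num_vars):
        # Exactly-one: nodes of the same variable are mutually connected.
        for d1 in range(num_values):
            for d2 in range(d1 + 1, num_values):
                edges.append((node_id[(v, d1)], node_id[(v, d2)]))
    for (i, j) in constraints:
        # Conflict: the same value on two constrained variables.
        for d in range(num_values):
            edges.append((node_id[(i, d)], node_id[(j, d)]))
    # Supervision for pre-assigned variables: 1 on the given value, 0 elsewhere.
    labels = {node_id[(v, d)]: int(givens[v] == d)
              for (v, d) in node_id if v in givens}
    return node_id, edges, labels


def build_multivalued_graph(num_vars, num_values, constraints, givens):
    """Approach 2 (multi-valued): variable nodes plus one node per value.

    Value nodes are appended after the variable nodes, and a variable is
    wired to a value node when the initialization fixes it to that value,
    so a larger value-set only adds nodes instead of widening a classifier.
    """
    value_node = {d: num_vars + d for d in range(num_values)}
    edges = list(constraints)               # variable-variable edges
    for v, d in givens.items():             # variable-value edges from givens
        edges.append((v, value_node[d]))
    return value_node, edges


# Toy graph-coloring instance: 4 variables on a path, 3 colors, one given.
constraints = [(0, 1), (1, 2), (2, 3)]
givens = {0: 2}
_, bin_edges, bin_labels = build_binarized_graph(4, 3, constraints, givens)
_, mv_edges = build_multivalued_graph(4, 3, constraints, givens)
```

Note the node counts: the binarized graph has num_vars x num_values nodes, while the multi-valued graph has only num_vars + num_values, which mirrors the memory trade-off between the two models described in the abstract.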
