Paper Title
Learning Generalizable Models for Vehicle Routing Problems via Knowledge Distillation
Paper Authors
Paper Abstract
Recent neural methods for vehicle routing problems always train and test the deep models on the same instance distribution (i.e., uniform). To tackle the consequent cross-distribution generalization concerns, we bring knowledge distillation to this field and propose an Adaptive Multi-Distribution Knowledge Distillation (AMDKD) scheme for learning more generalizable deep models. In particular, our AMDKD leverages the diverse knowledge from multiple teachers trained on exemplar distributions to yield a light-weight yet generalist student model. Meanwhile, we equip AMDKD with an adaptive strategy that allows the student to concentrate on difficult distributions, so as to absorb hard-to-master knowledge more effectively. Extensive experimental results show that, compared with the baseline neural methods, our AMDKD achieves competitive results on both unseen in-distribution and out-of-distribution instances, which are either randomly synthesized or adopted from benchmark datasets (i.e., TSPLIB and CVRPLIB). Notably, our AMDKD is generic and consumes fewer computational resources for inference.
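The following is a minimal sketch of the adaptive multi-teacher distillation loop described in the abstract, written in a PyTorch style. The exemplar distribution set, the placeholder policy network, the instance generators, and the exponential-moving-average difficulty estimate are all illustrative assumptions for exposition; this is not the authors' released implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

DISTRIBUTIONS = ["uniform", "cluster", "mixed"]   # assumed set of exemplar distributions

class TinyPolicy(nn.Module):
    """Stand-in for a routing policy: maps node coordinates to per-node logits."""
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, coords):                     # coords: (batch, n_nodes, 2)
        return self.net(coords).squeeze(-1)        # logits: (batch, n_nodes)

def sample_instances(dist, batch=32, n_nodes=20):
    """Hypothetical instance generators for the exemplar distributions."""
    if dist == "uniform":
        return torch.rand(batch, n_nodes, 2)
    if dist == "cluster":
        centers = torch.rand(batch, 1, 2)
        return (centers + 0.05 * torch.randn(batch, n_nodes, 2)).clamp(0, 1)
    half = n_nodes // 2                            # "mixed": half uniform, half clustered
    return torch.cat([sample_instances("uniform", batch, half),
                      sample_instances("cluster", batch, n_nodes - half)], dim=1)

# One placeholder teacher per exemplar distribution; in practice each teacher is
# pretrained on its distribution and kept frozen during distillation.
teachers = {d: TinyPolicy() for d in DISTRIBUTIONS}
student = TinyPolicy(hidden=32)                    # light-weight generalist student
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
difficulty = {d: 1.0 for d in DISTRIBUTIONS}       # running distillation loss per distribution

for step in range(1000):
    # Adaptive strategy: sample the next training distribution in proportion to
    # its current difficulty, so harder distributions are visited more often.
    probs = torch.tensor([difficulty[d] for d in DISTRIBUTIONS])
    dist = DISTRIBUTIONS[torch.multinomial(probs / probs.sum(), 1).item()]

    coords = sample_instances(dist)
    with torch.no_grad():
        teacher_logits = teachers[dist](coords)
    student_logits = student(coords)

    # Distillation loss: match the student's node-selection distribution to the
    # corresponding teacher's distribution on the same instances.
    loss = F.kl_div(F.log_softmax(student_logits, dim=-1),
                    F.softmax(teacher_logits, dim=-1),
                    reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Update the difficulty estimate with an exponential moving average.
    difficulty[dist] = 0.9 * difficulty[dist] + 0.1 * loss.item()

Using the running distillation loss as the difficulty signal is only one plausible choice; the key idea it illustrates is that the student's sampling over distributions adapts to where its knowledge transfer is currently weakest.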