Paper Title
Neural Data Server: A Large-Scale Search Engine for Transfer Learning Data
Paper Authors
Abstract
Transfer learning has proven to be a successful technique for training deep learning models in domains where little training data is available. The dominant approach is to pretrain a model on a large generic dataset such as ImageNet and finetune its weights on the target domain. However, in the new era of an ever-increasing number of massive datasets, selecting the relevant data for pretraining is a critical issue. We introduce Neural Data Server (NDS), a large-scale search engine for finding the most useful transfer learning data for the target domain. NDS consists of a dataserver, which indexes several large popular image datasets and aims to recommend data to a client: an end-user with a target application and its own small labeled dataset. The dataserver represents large datasets with a much more compact mixture-of-experts model, and employs it to perform data search in a series of dataserver-client transactions at a low computational cost. We show the effectiveness of NDS in various transfer learning scenarios, demonstrating state-of-the-art performance on several target datasets and tasks such as image classification, object detection and instance segmentation. Neural Data Server is available as a web service at http://aidemos.cs.toronto.edu/nds/.
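The abstract does not spell out how a dataserver-client transaction turns expert evaluations into a data recommendation, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the indexed data is split into partitions with one expert each, that the client reports one score per expert (higher meaning a better fit to its own data), and that the server converts scores into sampling weights with a softmax. The names `softmax`, `allocate`, `recommend`, `temperature`, and `budget` are hypothetical and chosen for illustration.

```python
import math
import random


def softmax(scores, temperature=1.0):
    """Turn raw expert scores into weights that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def allocate(weights, budget):
    """Split an integer budget across partitions (largest-remainder rounding)."""
    raw = [w * budget for w in weights]
    quotas = [int(r) for r in raw]
    leftover = max(budget - sum(quotas), 0)
    # Hand the remaining slots to the partitions with the largest fractional parts.
    by_fraction = sorted(range(len(raw)), key=lambda i: raw[i] - quotas[i], reverse=True)
    for i in by_fraction[:leftover]:
        quotas[i] += 1
    return quotas


def recommend(partitions, expert_scores, budget, temperature=1.0, seed=0):
    """Pick items from each partition in proportion to its expert's weight.

    Assumption: partitions[k] is the list of items indexed under expert k, and
    expert_scores[k] is the score the client measured for that expert.
    A partition smaller than its quota simply contributes all of its items,
    so fewer than `budget` items may be returned.
    """
    if len(partitions) != len(expert_scores):
        raise ValueError("need exactly one score per partition")
    rng = random.Random(seed)
    quotas = allocate(softmax(expert_scores, temperature), budget)
    picked = []
    for part, quota in zip(partitions, quotas):
        picked.extend(rng.sample(part, min(quota, len(part))))
    return picked


if __name__ == "__main__":
    # Three hypothetical partitions of 20 image ids each.
    partitions = [[f"p{k}_img{i}" for i in range(20)] for k in range(3)]
    print(recommend(partitions, [2.0, 1.0, 0.0], budget=10))
```

In this sketch the client never uploads its dataset and the server never ships its full index; only a short vector of scores crosses the wire, which is one plausible reading of the "low computational cost" claim. Lowering `temperature` concentrates the budget on the best-scoring partition, while raising it spreads the budget more evenly.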