Paper Title
RareGAN: Generating Samples for Rare Classes
Paper Authors
Paper Abstract
We study the problem of learning generative adversarial networks (GANs) for a rare class of an unlabeled dataset subject to a labeling budget. This problem is motivated by practical applications in domains including security (e.g., synthesizing packets for DNS amplification attacks), systems and networking (e.g., synthesizing workloads that trigger high resource usage), and machine learning (e.g., generating images from a rare class). Existing approaches are unsuitable: they either require fully labeled datasets or sacrifice the fidelity of the rare class for that of the common classes. We propose RareGAN, a novel synthesis of three key ideas: (1) extending conditional GANs to use labeled and unlabeled data for better generalization; (2) an active learning approach that requests the most useful labels; and (3) a weighted loss function to favor learning the rare class. We show that RareGAN achieves a better fidelity-diversity tradeoff on the rare class than prior work across different applications, budgets, rare class fractions, GAN losses, and architectures.
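To make ideas (2) and (3) more concrete, the following is a minimal sketch in plain Python. It is not the formulation used in the paper, which the abstract does not specify. It assumes a binary classifier that outputs the probability that a sample belongs to the rare class, uses uncertainty sampling as a stand-in for "requesting the most useful labels", and uses a class-weighted cross-entropy as a stand-in for the weighted loss. The function names and the `rare_weight` parameter are hypothetical.

```python
import math


def weighted_log_loss(probs, labels, rare_weight=3.0):
    """Weighted mean cross-entropy; rare-class samples (label 1) count rare_weight times.

    Illustrative assumption only: the abstract does not give the actual loss.
    """
    eps = 1e-12
    total, weight_sum = 0.0, 0.0
    for p, y in zip(probs, labels):
        w = rare_weight if y == 1 else 1.0
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
        weight_sum += w
    return total / weight_sum


def select_most_uncertain(probs, budget):
    """Return indices of the `budget` samples whose predicted probability is closest to 0.5.

    Uncertainty sampling is an assumed selection rule, chosen here for illustration.
    """
    ranked = sorted(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))
    return ranked[:budget]


# Hypothetical classifier outputs for four unlabeled samples, with a budget of two labels.
to_label = select_most_uncertain([0.9, 0.55, 0.1, 0.48], budget=2)
```

With these inputs, the two samples nearest the decision boundary are chosen for labeling. Raising `rare_weight` increases the penalty for misclassifying a rare-class sample relative to a common-class one, which is the general effect a loss that "favors learning the rare class" aims for.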