Paper Title
Differentially Private Image Classification from Features
Paper Authors
Paper Abstract
Leveraging transfer learning has recently been shown to be an effective strategy for training large models with Differential Privacy (DP). Moreover, somewhat surprisingly, recent works have found that privately training just the last layer of a pre-trained model provides the best utility with DP. While past studies largely rely on algorithms like DP-SGD for training large models, in the specific case of privately learning from features, we observe that the computational burden is low enough to allow for more sophisticated optimization schemes, including second-order methods. To that end, we systematically explore the effect of design parameters such as the loss function and the optimization algorithm. We find that, while the commonly used logistic regression performs better than linear regression in the non-private setting, the situation is reversed in the private setting. We find that linear regression is much more effective than logistic regression from both the privacy and computational perspectives, especially at stricter epsilon values ($\varepsilon < 1$). On the optimization side, we also explore using Newton's method, and find that second-order information is quite helpful even with privacy, although the benefit diminishes significantly under stricter privacy guarantees. While both methods use second-order information, least squares is effective at lower epsilons, while Newton's method is effective at larger epsilon values. To combine the benefits of both, we propose a novel algorithm called DP-FC, which leverages the feature covariance instead of the Hessian of the logistic regression loss, and which performs well across all $\varepsilon$ values we tried. With this, we obtain new SOTA results on ImageNet-1k, CIFAR-100 and CIFAR-10 across all values of $\varepsilon$ typically considered. Most remarkably, on ImageNet-1k, we obtain a top-1 accuracy of 88\% under $(8, 8 \times 10^{-7})$-DP and 84.3\% under $(0.1, 8 \times 10^{-7})$-DP.
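The abstract contrasts logistic regression trained with DP-SGD against linear regression solved from privatized second-order statistics of the features. As a rough illustration of the covariance-based idea, here is a minimal sketch of DP least squares via sufficient-statistics perturbation: add Gaussian noise to the feature covariance $X^\top X$ and the correlation $X^\top y$, then solve the normal equations. The function name, noise calibration, clipping bound, and regularization constant below are illustrative assumptions for a one-shot Gaussian mechanism, not the paper's actual DP-FC algorithm.

```python
import numpy as np

def dp_linear_regression(X, y, epsilon, delta, clip_norm=1.0):
    """Sketch of DP least squares via sufficient-statistics perturbation
    (an assumed, simplified scheme -- not the paper's DP-FC algorithm).

    Assumes labels y are bounded comparably to clip_norm so that the
    per-example sensitivity of both statistics is O(clip_norm**2)."""
    n, d = X.shape
    # Clip each feature row to an L2 norm of clip_norm to bound the
    # contribution of any single example to the sufficient statistics.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    X = X * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    # Gaussian-mechanism noise scale (standard calibration for
    # (epsilon, delta)-DP; tighter accountants exist).
    sigma = clip_norm**2 * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    # Privatize the feature covariance and feature-label correlation.
    cov = X.T @ X + np.random.normal(0.0, sigma, size=(d, d))
    cov = (cov + cov.T) / 2.0  # symmetrize the noisy covariance
    xty = X.T @ y + np.random.normal(0.0, sigma, size=d)
    # Ridge-style regularization keeps the noisy system well-conditioned.
    return np.linalg.solve(cov + 0.1 * n * np.eye(d), xty)
```

In this picture, the key difference the abstract highlights is which second-order matrix gets privatized: Newton's method for logistic regression would noise the loss Hessian at every iterate, whereas the covariance $X^\top X$ is label-independent and can be privatized once, which is one intuition for why a covariance-based solver can remain effective at small $\varepsilon$.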