Paper title

Differentially Private Kolmogorov-Smirnov-Type Tests

Authors

Jordan Awan, Yue Wang

Abstract

Hypothesis testing is a central problem in statistical analysis, and there is currently a lack of differentially private tests that are both statistically valid and powerful. In this paper, we develop several new differentially private (DP) nonparametric hypothesis tests. Our tests are based on the Kolmogorov-Smirnov, Kuiper, Cramér-von Mises, and Wasserstein test statistics, which can all be expressed as a pseudo-metric on empirical cumulative distribution functions (ecdfs), and can be used to test hypotheses on goodness-of-fit, two samples, and paired data. We show that these test statistics have low sensitivity, requiring minimal noise to satisfy DP. In particular, we show that the sensitivity of these test statistics can be expressed in terms of the base sensitivity, which is the pseudo-metric distance between the ecdfs of adjacent databases and is easily calculated. The sampling distributions of our test statistics are distribution-free under the null hypothesis, enabling easy computation of $p$-values by Monte Carlo methods. We show that in several settings, especially with small privacy budgets or heavy-tailed data, our new DP tests outperform alternative nonparametric DP tests.
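To make the recipe in the abstract concrete, here is a minimal sketch of a one-sample (goodness-of-fit) DP Kolmogorov-Smirnov test. It is an illustration under stated assumptions, not the authors' implementation. The assumptions are that adjacent databases differ by replacing one record, that the sample size n is public, and that noise comes from the Laplace mechanism, which may differ from the mechanism used in the paper. Under that adjacency notion, replacing one record moves the ecdf by at most 1/n at every point, so the KS statistic has sensitivity 1/n. The function names `ks_statistic`, `laplace`, and `dp_ks_test` are hypothetical.

```python
import random


def ks_statistic(sample, cdf):
    """One-sample KS statistic sup_x |F_n(x) - F_0(x)|."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f0 = cdf(x)
        # The ecdf jumps from i/n to (i+1)/n at x; check both sides of the jump.
        d = max(d, (i + 1) / n - f0, f0 - i / n)
    return d


def laplace(rng, scale):
    """Laplace(0, scale) draw: the difference of two Exp(1) draws, rescaled."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def dp_ks_test(sample, cdf, epsilon, n_sim=2000, seed=0):
    """Return (noisy KS statistic, Monte Carlo p-value).

    Assumes substitution adjacency and a public sample size n, so the
    sensitivity of the KS statistic is 1/n (an assumption of this sketch).
    """
    rng = random.Random(seed)
    n = len(sample)
    scale = (1.0 / n) / epsilon  # Laplace scale = sensitivity / epsilon
    noisy = ks_statistic(sample, cdf) + laplace(rng, scale)

    # Under H0 with a continuous F_0, F_0(X_i) is Uniform(0, 1), so the null
    # distribution of the noisy statistic does not depend on F_0. Simulate it
    # with uniform samples plus fresh noise; this touches no private data.
    exceed = 0
    for _ in range(n_sim):
        u = [rng.random() for _ in range(n)]
        sim = ks_statistic(u, lambda t: t) + laplace(rng, scale)
        if sim >= noisy:
            exceed += 1
    # Add-one correction keeps the Monte Carlo p-value strictly positive.
    return noisy, (exceed + 1) / (n_sim + 1)
```

Because the p-value is computed only from the noisy statistic and simulated data, it is post-processing and does not consume additional privacy budget. A call such as `dp_ks_test(data, my_cdf, epsilon=1.0)` returns the privatized statistic together with its p-value, where `my_cdf` is the hypothesized continuous CDF.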
