Paper title
Strongly universally consistent nonparametric regression and classification with privatised data
Paper authors
Paper abstract
In this paper we revisit the classical problem of nonparametric regression, but impose local differential privacy constraints. Under such constraints, the raw data $(X_1,Y_1),\ldots,(X_n,Y_n)$, taking values in $\mathbb{R}^d \times \mathbb{R}$, cannot be directly observed, and all estimators are functions of the randomised output from a suitable privacy mechanism. The statistician is free to choose the form of the privacy mechanism, and here we add Laplace distributed noise to a discretisation of the location of a feature vector $X_i$ and to the value of its response variable $Y_i$. Based on this randomised data, we design a novel estimator of the regression function, which can be viewed as a privatised version of the well-studied partitioning regression estimator. The main result is that the estimator is strongly universally consistent. Our methods and analysis also give rise to a strongly universally consistent binary classification rule for locally differentially private data.
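The mechanism described in the abstract can be illustrated with a minimal sketch. It is not the paper's exact construction: the abstract does not give the noise scales, the truncation level, or the thresholding rule, so everything below is an assumption made for illustration. In particular, the sketch assumes a one-dimensional feature in $[0,1)$, a fixed number of equal-width cells (`num_cells`), a clipping level `clip` for the response, the standard Laplace-mechanism calibration with the privacy budget `eps` split evenly between the two released vectors, and a hypothetical `threshold` below which the estimate is set to zero. The function names are likewise hypothetical.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatise(x, y, num_cells, clip, eps, rng):
    """Return the noisy vectors (w, z) released by one data holder.

    Assumed calibration: the one-hot cell indicator has L1 sensitivity 2
    and the clipped response times the indicator has L1 sensitivity
    2 * clip; each vector receives half of the budget eps.
    """
    cell = min(int(x * num_cells), num_cells - 1)  # discretise x in [0, 1)
    y_clipped = max(-clip, min(clip, y))           # bound the response
    w, z = [], []
    for j in range(num_cells):
        ind = 1.0 if j == cell else 0.0
        w.append(ind + laplace(4.0 / eps, rng))
        z.append(y_clipped * ind + laplace(4.0 * clip / eps, rng))
    return w, z


def estimate(released, num_cells, threshold):
    """Privatised partitioning estimate: one fitted value per cell.

    Uses only the released (w, z) pairs, never the raw data. Cells whose
    noisy average occupancy is at most `threshold` are set to 0.0.
    """
    n = len(released)
    fitted = []
    for j in range(num_cells):
        den = sum(w[j] for w, _ in released) / n
        num = sum(z[j] for _, z in released) / n
        fitted.append(num / den if den > threshold else 0.0)
    return fitted


# Usage: synthetic data with regression function m(x) = 2x.
rng = random.Random(0)
raw = [(x, 2.0 * x + rng.gauss(0.0, 0.1))
       for x in (rng.random() for _ in range(2000))]
released = [privatise(x, y, num_cells=4, clip=3.0, eps=2.0, rng=rng)
            for x, y in raw]
print(estimate(released, num_cells=4, threshold=0.05))
```

Without noise, the ratio in `estimate` reduces to the ordinary partitioning estimate, that is, the average of the responses whose features fall in the cell. The Laplace noise has mean zero, so it averages out in both numerator and denominator as the sample grows, which is the intuition behind treating the estimator as a privatised version of the partitioning estimator.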