Locally differentially-private distribution estimation
We consider a setup in which confidential i.i.d. samples X1, ..., Xn from an unknown discrete distribution PX are passed through a discrete memoryless privatization channel (a.k.a. mechanism) that guarantees local differential privacy at level epsilon. For a given epsilon, the channel should be designed so that an estimate of the source distribution based on the channel outputs converges as fast as possible to the true distribution PX. To this end, we consider two metrics of estimation accuracy: the expected mean-square error and the expected Kullback-Leibler divergence. We derive their respective normalized first-order terms (as n tends to infinity), which, for a given target privacy level epsilon, represent the factor by which the sample size must be increased to achieve the same estimation accuracy as an identity (non-privatizing) channel. We formulate the privacy-utility tradeoff problem as that of minimizing this first-order term subject to the privacy constraint epsilon. We state a converse bound that keeps the optimal tradeoff away from the origin. Inspired by recent work on the optimality of staircase mechanisms (albeit for objectives different from ours), we derive an achievable tradeoff based on circulant step mechanisms. Within this finite class, we determine the optimal step pattern.
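As an illustration only, the following minimal Python sketch builds one plausible reading of a circulant step mechanism and checks its privacy level numerically. It assumes (this is not stated in the abstract) that such a mechanism is a k-by-k channel matrix whose rows are cyclic shifts of a two-level step pattern, with m "high" entries that are exp(epsilon) times larger than the remaining k - m "low" entries. The function names and the parameters k, m and eps are hypothetical, and the exact definition used in the paper may differ.

```python
import math


def circulant_step_mechanism(k, m, eps):
    """Return a k x k row-stochastic matrix Q with Q[x][y] = P(Y = y | X = x).

    Assumed construction: the first row holds m 'high' entries followed by
    k - m 'low' entries, with high = exp(eps) * low; row x is the first row
    cyclically shifted right by x positions.
    """
    if not 1 <= m < k:
        raise ValueError("need 1 <= m < k")
    low = 1.0 / (m * math.exp(eps) + (k - m))  # normalizes each row to sum to 1
    high = math.exp(eps) * low
    first_row = [high] * m + [low] * (k - m)
    return [[first_row[(y - x) % k] for y in range(k)] for x in range(k)]


def ldp_level(Q):
    """Smallest eps for which Q is eps-locally differentially private.

    For each output y, compare the largest and smallest conditional
    probability over all inputs x and take the worst log-ratio.
    """
    worst = 0.0
    for y in range(len(Q[0])):
        column = [row[y] for row in Q]
        worst = max(worst, math.log(max(column) / min(column)))
    return worst


# Hypothetical example values: alphabet size 5, step width 2, target eps = 1.
Q = circulant_step_mechanism(k=5, m=2, eps=1.0)
row_sums = [sum(row) for row in Q]
achieved_eps = ldp_level(Q)
```

Under this assumed construction, every column contains both a high and a low entry, so the worst-case likelihood ratio is exactly exp(epsilon) and the privacy constraint is met with equality. With m = 1 the matrix reduces to the familiar k-ary randomized response; varying m corresponds to changing the step pattern within the class.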