Learning with Compressible Priors

We describe a set of probability distributions, dubbed compressible priors, whose independent and identically distributed (iid) realizations result in p-compressible signals. A signal x in R^N is called p-compressible with magnitude R if its sorted coefficients exhibit a power-law decay as |x|_(i) <= R i^d, where the decay rate d is equal to 1/p. p-compressible signals live close to K-sparse signals (K<<N) in the ell_r-norm (r > p) since their best K-sparse approximation error decreases with O(R K^{1/r-1/p}). We show that the membership of generalized Pareto, Student’s t, log-normal, Frechet, and log-logistic distributions to the set of compressible priors depends only on the distribution parameters and is independent of N. In contrast, we demonstrate that the membership of the generalized Gaussian distribution (GGD) depends both on the signal dimension and the GGD parameters: the expected decay rate of N-sample iid realizations from the GGD with the shape parameter q is given by 1/[q log (N/q)]. As stylized examples, we show via experiments that the wavelet coefficients of natural images are 1.67-compressible whereas their pixel gradients are 0.95 log (N/0.95)-compressible, on the average. We also leverage the connections between compressible priors and sparse signals to develop new iterative re-weighted sparse signal recovery algorithms that outperform the standard ell_1-norm minimization. Finally, we describe how to learn the hyperparameters of compressible priors in underdetermined regression problems.

Presented at:
Neural Information Processing Systems (NIPS), Vancouver, B.C., Canada, December 2009

 Record created 2010-09-07, last modified 2018-03-17

Learning with Compressible Priors - Download fulltextPDF
randcs - Download fulltextZIP
Rate this document:

Rate this document:
(Not yet reviewed)