Generalization of Noisy SGD in Unbounded Non-convex Settings
We study the generalization of iterative noisy gradient schemes on smooth non-convex losses. Formally, we establish time-independent, information-theoretic generalization bounds for Stochastic Gradient Langevin Dynamics (SGLD) that do not diverge as the iteration count increases. Our bounds are obtained through a stability argument: we analyze the divergence between two SGLD sequences run in parallel on two datasets sampled from the same distribution. Our result only requires an isoperimetric inequality to hold, which is merely a restriction on the tails of the loss. We relax the assumptions of prior work to establish that the iterates stay within a bounded KL divergence from each other. Under an additional dissipativity assumption, we show that the stronger Rényi divergence also stays bounded, by establishing a uniform log-Sobolev constant for the iterates. Without dissipativity, we sidestep the need for local log-Sobolev inequalities and instead exploit the regularizing properties of Gaussian convolution. These techniques allow us to show that strong convexity is not necessary for finite stability bounds, and thus for finite generalization and differential privacy bounds.
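For concreteness, the recursion and the stability quantity discussed above can be written as follows; this is a minimal sketch under assumed notation (the symbols $F_S$, $\eta$, $\beta$ are illustrative placeholders, not taken from the paper).

% Sketch: SGLD run in parallel on two datasets S and S' drawn from the same distribution.
% Notation ($F_S$, $\eta$, $\beta$) is assumed for illustration only.
\begin{align*}
  \theta_{k+1}  &= \theta_k  - \eta\,\nabla F_S(\theta_k)   + \sqrt{2\eta/\beta}\,\xi_k,
  & \xi_k  &\sim \mathcal{N}(0, I_d),\\
  \theta'_{k+1} &= \theta'_k - \eta\,\nabla F_{S'}(\theta'_k) + \sqrt{2\eta/\beta}\,\xi'_k,
  & \xi'_k &\sim \mathcal{N}(0, I_d),
\end{align*}
where $F_S$ is the empirical loss on dataset $S$, $\eta$ the step size, and $\beta$ the inverse temperature. The stability argument tracks a divergence between the laws of the two chains, e.g.
\[
  \mathrm{KL}\!\left(\mathcal{L}(\theta_k)\,\middle\|\,\mathcal{L}(\theta'_k)\right)
  \quad\text{or}\quad
  R_\alpha\!\left(\mathcal{L}(\theta_k)\,\middle\|\,\mathcal{L}(\theta'_k)\right),
\]
and the bounds assert that this quantity remains finite uniformly over the iteration count $k$.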