Universality laws for Gaussian mixtures in generalized linear models
A recent line of work in high-dimensional statistics, working under the Gaussian mixture hypothesis, has led to a number of results in the context of empirical risk minimization, Bayesian uncertainty quantification, the separation of kernel methods and neural networks, and the ensembling and fluctuations of random features. We provide rigorous proofs for the applicability of these results to a general class of datasets $(x_i, y_i)_{i=1,\dots,n}$ containing independent samples from a mixture distribution $\sum_{c \in \mathcal{C}} \rho_c P_c^x$. Specifically, we consider the hypothesis class of generalized linear models $\hat{y} = F(\Theta^\top x)$ and investigate the asymptotic joint statistics of a family of generalized linear estimators $(\Theta^{(1)}, \dots, \Theta^{(M)})$, obtained either from (a) minimizing an empirical risk $\hat{R}_n^{(m)}(\Theta^{(m)}; X, y)$ or (b) sampling from the associated Gibbs measure $\exp(-\beta n \hat{R}_n^{(m)}(\Theta^{(m)}; X, y))$. Our main contribution is to characterize under which conditions the asymptotic joint statistics of this family depend (in a weak sense) only on the means and covariances of the class-conditional feature distributions $P_c^x$. This allows us to prove the universality of several quantities of interest, including the training and generalization errors, as well as the geometrical properties and correlations of the estimators.
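As a rough illustration of the universality statement (not code from the paper), the following Python sketch fits a logistic-regression ERM estimator on a toy two-class mixture whose class-conditional distributions are non-Gaussian, and repeats the experiment with each class replaced by a Gaussian with matched mean and covariance; universality suggests the resulting training and test errors should be close in high dimension. The dimensions, sample sizes, mixture weights, and loss are illustrative assumptions.

```python
# Minimal sketch of the universality setup, under illustrative assumptions
# (class means, covariances, sample sizes, and logistic loss are not from the paper).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d, n = 200, 4000
rho = [0.5, 0.5]                                    # mixture weights rho_c
means = [np.ones(d) / np.sqrt(d), -np.ones(d) / np.sqrt(d)]

def sample_mixture(n, gaussian_equivalent=False):
    """Draw (x_i, y_i) from a two-class mixture; if gaussian_equivalent, each
    class-conditional law is swapped for a Gaussian with matched mean/covariance."""
    labels = rng.choice(len(rho), size=n, p=rho)
    X = np.empty((n, d))
    for c, mu in enumerate(means):
        idx = labels == c
        if gaussian_equivalent:
            X[idx] = mu + rng.standard_normal((idx.sum(), d))       # matched moments
        else:
            # non-Gaussian class-conditional P_c^x: Rademacher coordinates
            # (mean 0, identity covariance around the class mean)
            X[idx] = mu + rng.choice([-1.0, 1.0], size=(idx.sum(), d))
    y = 2 * labels - 1
    return X, y

for tag, gauss in [("mixture P_c^x", False), ("Gaussian equivalent", True)]:
    Xtr, ytr = sample_mixture(n, gauss)
    Xte, yte = sample_mixture(n, gauss)
    clf = LogisticRegression(C=1.0, max_iter=2000).fit(Xtr, ytr)    # ridge-penalized ERM
    print(tag, "train err:", 1 - clf.score(Xtr, ytr), "test err:", 1 - clf.score(Xte, yte))
```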
Files:
- NeurIPS-2023-universality-laws-for-gaussian-mixtures-in-generalized-linear-models-Paper-Conference.pdf (Main Document, Adobe PDF, 532.62 KB, open access)
- 13094_universality_laws_for_gaussian-Supplementary Material.pdf (Supplementary Material, Adobe PDF, 504.59 KB, open access)