Analysis of Bootstrap and Subsampling in High-dimensional Regularized Regression
We investigate popular resampling methods for estimating the uncertainty of statistical models, such as subsampling, the bootstrap, and the jackknife, and their performance in high-dimensional supervised regression tasks. We provide a tight asymptotic description of the biases and variances estimated by these methods in the context of generalized linear models, such as ridge and logistic regression, taking the limit where the number of samples and the dimension of the covariates grow at a comparable fixed rate. Our findings are threefold: i) resampling methods are fraught with problems in high dimensions and exhibit the double-descent-like behavior typical of this regime; ii) they provide consistent and reliable error estimates only when the sampling ratio is large enough, and we give convergence rates; iii) in the over-parametrized regime relevant to modern machine learning practice, their predictions are not consistent, even with optimal regularization.
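For concreteness, here is a minimal sketch (not the paper's code) of the kind of resampling estimator the abstract refers to: a pairs bootstrap for the per-coordinate variance of the ridge estimator on a synthetic Gaussian design. All names and parameter values (`n`, `d`, `lam`, `B`, the helper `ridge`) are illustrative assumptions.

```python
# Minimal sketch: pairs-bootstrap variance estimate for ridge regression
# in the proportional regime where n and d are of comparable size.
# All parameter choices below are illustrative, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)

n, d, lam, B = 400, 200, 1.0, 200          # samples, dimension, ridge penalty, resamples
theta_star = rng.normal(size=d) / np.sqrt(d)
X = rng.normal(size=(n, d))                # Gaussian covariates
y = X @ theta_star + rng.normal(size=n)    # linear model with unit noise

def ridge(X, y, lam):
    """Closed-form ridge estimator (X^T X + lam*I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

theta_hat = ridge(X, y, lam)

# Pairs bootstrap: refit on n rows drawn with replacement, B times.
boot = np.empty((B, d))
for b in range(B):
    idx = rng.integers(0, n, size=n)
    boot[b] = ridge(X[idx], y[idx], lam)

# Bootstrap estimate of the per-coordinate variance of theta_hat.
var_boot = boot.var(axis=0)
print("mean bootstrap variance:", var_boot.mean())
```

Per the abstract's findings, in the proportional limit (d/n fixed as both grow) such bootstrap variance estimates are only consistent when the sampling ratio n/d is large enough, and are not consistent in the over-parametrized regime even under optimal regularization.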