Stochastic gradient with least-squares control variates

Authors: Nobile, Fabio; Raviola, Matteo; Schaeffer, Nathan
Dates: 2025-08-18; 2025-07-22
Links: https://infoscience.epfl.ch/handle/20.500.14299/252927 and https://arxiv.org/abs/2507.20981v1
Language: en
Type: text::preprint

The stochastic gradient descent (SGD) method is a widely used approach for solving stochastic optimization problems, but its convergence is typically slow. Existing variance reduction techniques, such as SAGA, improve convergence by leveraging stored gradient information; however, they are restricted to settings where the objective functional is a finite sum, and their performance degrades when the number of terms in the sum is large. In this work, we propose a novel approach which is well suited when the objective is given by an expectation over random variables with a continuous probability distribution. Our method constructs a control variate by fitting a linear model to past gradient evaluations using weighted discrete least-squares, effectively reducing variance while preserving computational efficiency. We establish theoretical sublinear convergence guarantees and demonstrate the method's effectiveness through numerical experiments on random PDE-constrained optimization problems.
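The control-variate idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the toy gradient `stochastic_grad`, the choice to make the model linear in the random variable `xi`, the geometric weights `0.95 ** k`, and the assumption that the mean of `xi` is known are all hypothetical choices made for illustration. The sketch fits a linear model to stored gradient samples by weighted least squares, then subtracts the centered model from fresh gradient samples and compares variances.

```python
import numpy as np


def stochastic_grad(x, xi):
    # Hypothetical toy gradient sample, nonlinear in xi (not from the paper).
    return x - xi + 0.1 * xi**2


def fit_linear_model(xi_hist, grad_hist, weights):
    """Weighted least-squares fit of grad ~ a + B @ xi; returns (a, B)."""
    n = xi_hist.shape[0]
    design = np.hstack([np.ones((n, 1)), xi_hist])  # columns: [1, xi]
    sw = np.sqrt(weights)[:, None]                  # row scaling for WLS
    coef, *_ = np.linalg.lstsq(sw * design, sw * grad_hist, rcond=None)
    return coef[0], coef[1:].T                      # a: (d,), B: (d, m)


rng = np.random.default_rng(0)
d = 3
xi_mean = np.zeros(d)  # assumption: the mean of xi is known exactly

# Stored history: past iterates near the current point, with their samples.
x_hist = 1.0 + 0.05 * rng.standard_normal((50, d))
xi_hist = rng.standard_normal((50, d))
grad_hist = stochastic_grad(x_hist, xi_hist)

# Assumed weighting: geometric decay, newest sample has weight 1.
weights = 0.95 ** np.arange(49, -1, -1)
a, B = fit_linear_model(xi_hist, grad_hist, weights)

# Fresh samples at the current iterate, independent of the fitted model.
x = np.ones(d)
xi_new = rng.standard_normal((2000, d))
plain = stochastic_grad(x, xi_new)

# Subtract the centered linear model: it has zero mean given B, so the
# corrected estimator keeps the same expectation as the plain one.
controlled = plain - (xi_new - xi_mean) @ B.T

var_plain = plain.var(axis=0).sum()
var_cv = controlled.var(axis=0).sum()
print(f"plain variance: {var_plain:.3f}, with control variate: {var_cv:.3f}")
```

Because the model is fitted only on past samples, the correction is independent of the fresh `xi` and leaves the expected gradient unchanged; in this toy setting most of the variance is linear in `xi`, so the reduction is large. How well this carries over depends on how close the true gradient is to the fitted model.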