Variance-Reduced Stochastic Learning Under Random Reshuffling
Several useful variance-reduced stochastic gradient algorithms, such as SVRG, SAGA, Finito, and SAG, have been proposed to minimize empirical risks, and they converge linearly to the exact minimizer. The existing convergence results assume uniform data sampling with replacement. However, related works have observed that random reshuffling can deliver superior performance over uniform sampling, yet no formal proofs or guarantees of exact convergence exist for variance-reduced algorithms under random reshuffling. This paper makes two contributions. First, it provides a theoretical guarantee of linear convergence under random reshuffling for SAGA in the mean-square sense; the argument is also adaptable to other variance-reduced algorithms. Second, under random reshuffling, it proposes a new amortized variance-reduced gradient (AVRG) algorithm whose storage requirements are constant, in contrast to SAGA, and whose gradient computations are balanced, in contrast to SVRG. AVRG is also shown analytically to converge linearly.
WOS:000519834600002
2020-01-01
68
1390
1408
REVIEWED
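
The difference between uniform sampling with replacement and random reshuffling, as contrasted in the abstract, can be illustrated with a minimal sketch of SAGA on a small least-squares problem. Everything here is an assumption made for illustration and is not taken from the paper: the function name `saga`, the `reshuffle` flag, the synthetic data, and the step size (a common heuristic for uniform sampling, not a condition derived in the paper). AVRG is not sketched because the abstract does not give its update rule.

```python
import numpy as np

def saga(X, y, step, epochs, reshuffle=True, seed=0):
    """Illustrative SAGA for the risk (1/n) * sum_i 0.5 * (x_i @ w - y_i)**2.

    reshuffle=True  -> each epoch visits every sample once, in a fresh random order.
    reshuffle=False -> each epoch draws n indices uniformly with replacement.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    table = np.zeros((n, d))   # most recent stored gradient for each sample
    avg = np.zeros(d)          # mean of the rows of `table`
    for _ in range(epochs):
        if reshuffle:
            order = rng.permutation(n)
        else:
            order = rng.integers(0, n, size=n)
        for i in order:
            g = X[i] * (X[i] @ w - y[i])          # gradient of the i-th loss at w
            w = w - step * (g - table[i] + avg)   # variance-reduced update
            avg += (g - table[i]) / n             # keep the mean in sync with the table
            table[i] = g
    return w

# Hypothetical synthetic problem, used only to exercise the sketch.
data_rng = np.random.default_rng(1)
X = data_rng.standard_normal((50, 5))
y = X @ data_rng.standard_normal(5) + 0.1 * data_rng.standard_normal(50)

w_star = np.linalg.lstsq(X, y, rcond=None)[0]      # exact minimizer of the risk
step = 1.0 / (3.0 * np.max(np.sum(X**2, axis=1)))  # assumed heuristic step size

w_rr = saga(X, y, step, epochs=300, reshuffle=True)
w_us = saga(X, y, step, epochs=300, reshuffle=False)
print("reshuffling error:     ", np.linalg.norm(w_rr - w_star))
print("uniform sampling error:", np.linalg.norm(w_us - w_star))
```

The table of stored gradients is what gives SAGA storage that grows with the number of samples, which is the cost the abstract says AVRG avoids. On this toy problem both sampling schemes should approach the least-squares solution; the sketch makes no claim about which one is faster.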