Convergence Of Variance-Reduced Learning Under Random Reshuffling
Several useful variance-reduced stochastic gradient algorithms, such as SVRG, SAGA, Finito, and SAG, have been proposed to minimize empirical risks with linear convergence properties to the exact minimizers. The existing convergence results assume uniform data sampling with replacement. However, it has been observed that random reshuffling can deliver superior performance and, yet, no formal proofs or guarantees of exact convergence exist for variance-reduced algorithms under random reshuffling. This paper makes two contributions. First, it resolves this open issue and provides the first theoretical guarantee of linear convergence under random reshuffling for SAGA; the argument is also adaptable to other variance-reduced algorithms. Second, under random reshuffling, the paper proposes a new amortized variance-reduced gradient (AVRG) algorithm with constant storage requirements compared to SAGA and with balanced gradient computations compared to SVRG. AVRG is also shown analytically to converge linearly.
WOS:000446384602093
2018-01-01
978-1-5386-4658-8
New York
2286
2290
REVIEWED
Event name | Event place | Event date |
Calgary, CANADA | Apr 15-20, 2018 | |