Asynchronous updates for stochastic gradient descent
Chiappa
Alberto Silvio
2017
Finding convergence rates for numerical optimization algorithms is an important task: it justifies their use in solving practical problems and provides a way to compare their efficiency. This is especially useful in an asynchronous setting, where algorithms are often observed in practice to be more efficient than their synchronous counterparts but lack the theory to justify this property. Furthermore, analyzing the issues that arise when inconsistency is taken into account gives a clearer picture of the downsides of inexact implementations. This work addresses the problem of finding an expected convergence rate for the asynchronous version of the widely used stochastic gradient descent algorithm, applied to the common class of problems whose cost function has a sum structure. It follows an approach similar to the one suggested by R. Leblond, F. Pedregosa and S. Lacoste-Julien in "ASAGA: Asynchronous Parallel SAGA" (2016) [RLLJ16], and also borrows their formalization of asynchronicity. The main result is a bound on the constant step size that guarantees convergence of the algorithm in expectation. The corresponding convergence rate is also obtained. The result is partially validated by sequential models of an asynchronous environment. We hope that this work can serve as a basis for applying the same approach to more specific algorithms, and that numerical experiments on real multiprocessor architectures can be performed in the future to further validate the convergence rates.
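To make the setting concrete, the following is a minimal sketch of what a sequential model of asynchronous stochastic gradient descent might look like. It is not the model used in this work: the bounded random delay, the one-dimensional least-squares cost, and the names `simulate_async_sgd`, `max_delay` and `step` are all assumptions chosen for illustration. Each update uses a gradient computed from an iterate that may be several updates old, which imitates a worker reading stale shared memory.

```python
import random


def simulate_async_sgd(a, b, step, max_delay, iters, seed=0):
    """Sequentially imitate asynchronous SGD with a constant step size.

    Cost (illustrative assumption): f(x) = (1/n) * sum_i 0.5 * (a[i] * x - b[i]) ** 2
    """
    rng = random.Random(seed)
    n = len(a)
    x = 0.0
    history = [x]  # all past iterates, so that a stale one can be read
    for _ in range(iters):
        # Assumed delay model: the iterate read is between 0 and max_delay updates old.
        delay = rng.randint(0, min(max_delay, len(history) - 1))
        stale_x = history[-1 - delay]
        # Sample one term of the sum and take its gradient at the stale iterate.
        i = rng.randrange(n)
        grad = a[i] * (a[i] * stale_x - b[i])
        # Apply the update to the current shared iterate.
        x -= step * grad
        history.append(x)
    return x
```

For example, with `a = [1.0, 2.0, 3.0]` and `b = [2.0, 4.0, 6.0]` every term of the sum is minimized at `x = 2`, so a call such as `simulate_async_sgd(a, b, step=0.02, max_delay=3, iters=2000)` should return a value close to 2. Increasing `step` or `max_delay` in this toy model gives a rough feel for why a bound on the constant step size is needed once delays are present.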
STUDENT