Polynomial Escape-Time From Saddle Points In Distributed Non-Convex Optimization
The diffusion strategy for distributed learning from streaming data employs local stochastic gradient updates along with the exchange of iterates over neighborhoods. In this work we establish that agents cluster around a network centroid in the mean-fourth sense and proceed to study the dynamics of this point. We establish expected descent in non-convex environments in the large-gradient regime and introduce a short-term model to examine the dynamics over finite-time horizons. Using this model, we establish that the diffusion strategy is able to escape from strict saddle points in O(1/mu) iterations, where mu denotes the step-size; it is also able to return approximately second-order stationary points in a polynomial number of iterations. Relative to prior works on the polynomial escape from saddle points, most of which focus on centralized perturbed or stochastic gradient descent, our approach requires less restrictive conditions on the gradient noise process.
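The abstract's diffusion strategy (local stochastic gradient step, then averaging of iterates over the neighborhood) can be illustrated with a minimal sketch. This is not the paper's exact algorithm or analysis setup: the objective, network size, combination matrix, noise level, and step-size below are all illustrative assumptions, chosen so the network centroid escapes a strict saddle at the origin toward one of the two minima.

```python
import numpy as np

rng = np.random.default_rng(0)
N, mu, sigma, T = 4, 0.05, 0.01, 2000  # agents, step-size, noise std, iterations

# Toy non-convex objective (an assumption, not from the paper):
# f(w) = w1^4/4 - w1^2/2 + w2^2/2, strict saddle at (0,0), minima at (+-1, 0).
def grad(w):
    return np.array([w[0] ** 3 - w[0], w[1]])

# Fully connected network with uniform (doubly stochastic) combination weights.
A = np.full((N, N), 1.0 / N)

# All agents start near the saddle point.
W = np.tile(np.array([0.01, 0.5]), (N, 1))

for _ in range(T):
    # Adapt: each agent takes a local stochastic gradient step
    # (additive gradient noise models streaming data).
    psi = np.array(
        [w - mu * (grad(w) + sigma * rng.standard_normal(2)) for w in W]
    )
    # Combine: agents exchange iterates and average over the neighborhood.
    W = A @ psi

# Network centroid: the point whose dynamics the paper studies.
centroid = W.mean(axis=0)
```

With these (assumed) parameters the centroid drifts away from the saddle along the negative-curvature direction and settles near one of the minima at (+-1, 0), consistent with the escape behavior the abstract describes.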
WOS:000556233000033
2019-01-01
New York
978-1-7281-5549-4
171
175
REVIEWED
EPFL
Event name | Event place | Event date |
| Guadeloupe, FRANCE | Dec 15-18, 2019 |