conference paper

Communication trade-offs for Local-SGD with large step size

Patel, Kumar Kshitij; Dieuleveut, Aymeric
January 1, 2019
Advances In Neural Information Processing Systems 32 (Nips 2019)
33rd Conference on Neural Information Processing Systems (NeurIPS)

Synchronous mini-batch SGD is state-of-the-art for large-scale distributed machine learning. In practice, however, its convergence is bottlenecked by slow communication rounds between worker nodes. A natural way to reduce communication is the "local-SGD" model, in which the workers train their models independently and synchronize only occasionally. This algorithm improves the computation-communication trade-off, but its convergence is not well understood. We propose a non-asymptotic error analysis, which enables comparison to one-shot averaging, i.e., a single communication round among independent workers, and mini-batch averaging, i.e., communicating at every step. We also provide adaptive lower bounds on the communication frequency for large step sizes ($t^{-\alpha}$, $\alpha \in (1/2, 1)$) and show that local-SGD reduces communication by a factor of $O(\sqrt{T}/P^{3/2})$, with $T$ the total number of gradients and $P$ the number of machines.
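
The local-SGD scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation or experimental setup: the least-squares objective, the function names `local_sgd` and `average`, the constant `c`, and the parameter `sync_every` are all assumptions made for illustration. Here `num_steps` counts local steps per worker, which is not the same quantity as the total number of gradients $T$ in the abstract. Setting `sync_every = num_steps` corresponds to one-shot averaging, and `sync_every = 1` to mini-batch averaging.

```python
import random


def average(vectors):
    """Coordinate-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def local_sgd(data, num_workers, num_steps, sync_every,
              alpha=0.75, c=0.2, seed=0):
    """Illustrative local-SGD on a least-squares problem.

    data: list of (x, y) pairs, with x a list of floats.
    Each worker takes independent SGD steps; all iterates are averaged
    every `sync_every` local steps (one communication round).
    Returns the averaged final iterate and the number of rounds used.
    """
    rng = random.Random(seed)
    dim = len(data[0][0])
    iterates = [[0.0] * dim for _ in range(num_workers)]
    rounds = 0
    for t in range(1, num_steps + 1):
        # Step size t^(-alpha), alpha in (1/2, 1); the scale c is an assumption.
        step = c * t ** (-alpha)
        for w in iterates:
            x, y = rng.choice(data)  # one stochastic sample per worker
            residual = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j in range(dim):
                w[j] -= step * residual * x[j]  # local gradient step
        if t % sync_every == 0:
            avg = average(iterates)  # communication round
            iterates = [list(avg) for _ in range(num_workers)]
            rounds += 1
    if num_steps % sync_every != 0:
        rounds += 1  # the final average needs one extra round
    return average(iterates), rounds
```

With this counting, one-shot averaging uses a single round and mini-batch averaging uses `num_steps` rounds; intermediate values of `sync_every` trade communication against how far the workers' iterates drift apart between synchronizations.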
