research article

Stochastic distributed learning with gradient quantization and double-variance reduction

Horvath, Samuel • Kovalev, Dmitry • Mishchenko, Konstantin • Richtarik, Peter • Stich, Sebastian
September 24, 2022
Optimization Methods & Software

We consider distributed optimization over several devices, each sending incremental model updates to a central server. This setting is considered, for instance, in federated learning. Various schemes have been designed to compress the model updates in order to reduce the overall communication cost. However, existing methods suffer from a significant slowdown due to additional variance ω > 0 coming from the compression operator and, as a result, only converge sublinearly. What is needed is a variance reduction technique for taming the variance introduced by compression. We propose the first methods that achieve linear convergence for arbitrary compression operators. For strongly convex functions with condition number κ, distributed among n machines with a finite-sum structure, each worker having less than m components, we also (i) give analysis for the weakly convex and the non-convex cases and (ii) verify in experiments that our novel variance-reduced schemes are more efficient than the baselines. Moreover, we show theoretically that as the number of devices increases, higher compression levels are possible without this affecting the overall number of communications in comparison with methods that do not perform any compression. This leads to a significant reduction in communication cost. Our general analysis allows us to pick the most suitable compression for each problem, finding the right balance between additional variance and communication savings. Finally, we also (iii) give analysis for arbitrary quantized updates.

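The record itself contains no code, but the mechanism sketched in the abstract can be illustrated with a short, hypothetical Python example. This is not the authors' implementation: it shows an unbiased rand-k compressor (whose variance parameter plays the role of ω) combined with a DIANA-style per-worker shift, so that workers compress only a residual and the variance injected by compression shrinks over time. All names and parameters here (rand_k_compress, compressed_step, lr, alpha, k) are illustrative assumptions.

```python
import numpy as np

def rand_k_compress(v, k, rng):
    """Rand-k sparsification: keep k uniformly random coordinates, rescaled by d/k.
    This is an unbiased compressor with variance parameter omega = d/k - 1."""
    d = v.size
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros_like(v)
    out[idx] = v[idx] * (d / k)
    return out

def compressed_step(x, grads, h, lr, alpha, k, rng):
    """One server round of compressed gradient descent with per-worker shifts h[i]
    (a DIANA-style control variate, one building block in this line of work):
    each worker compresses only the residual g_i - h_i, so the compression error
    decays as the shifts track the local gradients."""
    estimates = []
    for i, g_i in enumerate(grads):
        c_i = rand_k_compress(g_i - h[i], k, rng)  # transmit only the compressed residual
        estimates.append(h[i] + c_i)               # server-side unbiased gradient estimate
        h[i] = h[i] + alpha * c_i                  # worker and server update the shift in sync
    g_hat = np.mean(estimates, axis=0)             # aggregate across the n workers
    return x - lr * g_hat, h

# Toy usage on a 2-worker least-squares problem, f_i(x) = 0.5 * ||A_i x - b_i||^2.
rng = np.random.default_rng(0)
A = [rng.standard_normal((20, 10)) for _ in range(2)]
b = [rng.standard_normal(20) for _ in range(2)]
x, h = np.zeros(10), [np.zeros(10) for _ in range(2)]
for _ in range(2000):
    grads = [A[i].T @ (A[i] @ x - b[i]) for i in range(2)]
    x, h = compressed_step(x, grads, h, lr=0.002, alpha=0.1, k=3, rng=rng)
```

The design choice to compress the residual rather than the raw gradient is what distinguishes this kind of scheme from plain quantized SGD: with a fixed compressor the raw-gradient variance never vanishes, whereas the residual goes to zero at the optimum.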
Details
Type
research article
DOI
10.1080/10556788.2022.2117355
Web of Science ID
WOS:000860671700001
Author(s)
Horvath, Samuel
Kovalev, Dmitry
Mishchenko, Konstantin
Richtarik, Peter
Stich, Sebastian
Date Issued
2022-09-24
Publisher
Taylor & Francis Ltd
Published in
Optimization Methods & Software
Subjects
Computer Science, Software Engineering • Operations Research & Management Science • Mathematics, Applied • Computer Science • Mathematics • distributed optimization • federated learning • stochastic optimization • communication compression • variance reduction • gradient methods • ascent

Editorial or Peer reviewed
REVIEWED
Written at
EPFL
EPFL units
MLO
Available on Infoscience
October 24, 2022
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/191589