conference paper

Masked Training of Neural Networks with Partial Gradients

Mohtashami, Amirkeivan • Jaggi, Martin • Stich, Sebastian U.
January 1, 2022
International Conference On Artificial Intelligence And Statistics, Vol 151
International Conference on Artificial Intelligence and Statistics

State-of-the-art training algorithms for deep learning models are based on stochastic gradient descent (SGD). Recently, many variations have been explored: perturbing parameters for better accuracy (such as in Extra-gradient), limiting SGD updates to a subset of parameters for increased efficiency (such as meProp) or a combination of both (such as Dropout). However, the convergence of these methods is often not studied in theory.
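The core mechanism shared by these variants — applying an SGD step to only part of the gradient — can be sketched in a few lines. The toy example below is illustrative only (a quadratic objective with a random coordinate mask, not the paper's setup or meProp's actual selection rule):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy objective: f(w) = 0.5 * ||w - target||^2, so grad f(w) = w - target.
target = np.array([1.0, -2.0, 3.0, 0.5])


def grad(w):
    return w - target


w = np.zeros_like(target)
lr = 0.5

# Masked training: each step updates only a random subset of coordinates,
# i.e. the update uses a partial gradient rather than the full one.
for step in range(200):
    mask = rng.random(w.shape) < 0.5   # keep roughly half the entries
    w -= lr * mask * grad(w)

# Despite the partial updates, w still converges to the minimizer.
print(np.allclose(w, target, atol=1e-3))
```

Each coordinate is updated in only about half the steps, yet every coordinate's error still contracts geometrically over the steps in which it is active — the kind of behavior the paper's framework analyzes in general.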

We propose a unified theoretical framework to study such SGD variants, encompassing the aforementioned algorithms and additionally a broad variety of methods used for communication-efficient training or model compression. Our insights can be used as a guide to improve the efficiency of such methods and facilitate generalization to new applications. As an example, we tackle the task of jointly training networks, a version of which (limited to sub-networks) is used to create Slimmable Networks. By training a low-rank Transformer jointly with a standard one, we obtain better performance than when it is trained separately.
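The joint-training idea can also be reduced to a minimal numeric sketch: one shared weight vector serves two "models" (a full one and a restricted sub-model standing in for a slimmed or low-rank variant), and each step applies the sum of both gradients — the sub-model's gradient being exactly a partial gradient of the shared weights. All names here are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

# Shared weights serve two models: the "full" model uses every coordinate,
# the "sub" model only a fixed subset (a stand-in for a slimmed variant).
target = np.array([2.0, -1.0, 0.5, 4.0])
sub_mask = np.array([1.0, 1.0, 0.0, 0.0])  # coordinates the sub-model owns


def full_grad(w):
    # gradient of 0.5 * ||w - target||^2 for the full model
    return w - target


def sub_grad(w):
    # the sub-model's gradient: the same gradient masked to its coordinates,
    # i.e. a partial gradient of the shared weights
    return sub_mask * (w - target)


w = np.zeros_like(target)
lr = 0.1
for _ in range(500):
    # one joint step on the sum of both objectives
    w -= lr * (full_grad(w) + sub_grad(w))

print(np.allclose(w, target, atol=1e-4))
```

Because the sub-model's update is a masked version of the full gradient, joint training of this kind falls directly under the partial-gradient framework described in the abstract.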

Type: conference paper
Web of Science ID: WOS:000841852300012
Author(s): Mohtashami, Amirkeivan; Jaggi, Martin; Stich, Sebastian U.
Date Issued: 2022-01-01
Publisher: JMLR-JOURNAL MACHINE LEARNING RESEARCH
Publisher place: San Diego
Published in: International Conference On Artificial Intelligence And Statistics, Vol 151
Series title / vol.: Proceedings of Machine Learning Research, Vol. 151
Pages: 5876–5890
Subjects: Computer Science, Artificial Intelligence • Statistics & Probability • Computer Science • Mathematics
Editorial or Peer reviewed: REVIEWED
Written at: EPFL
EPFL units: MLO
Event: International Conference on Artificial Intelligence and Statistics, ELECTR NETWORK (online), Mar 28-30, 2022
Available on Infoscience: November 7, 2022
Use this identifier to reference this record: https://infoscience.epfl.ch/handle/20.500.14299/191911
Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved.