Chaining Meets Chain Rule: Multilevel Entropic Regularization and Training of Neural Networks

Asadi, Amir R. • Abbe, Emmanuel
January 1, 2020
Journal of Machine Learning Research

We derive generalization and excess risk bounds for neural networks using a family of complexity measures based on a multilevel relative entropy. The bounds are obtained by introducing the notion of generated hierarchical coverings of neural networks and by using the technique of chaining mutual information introduced by Asadi et al. '18. The resulting bounds are algorithm-dependent and multiscale: they exploit the multilevel structure of neural networks. This, in turn, leads to an empirical risk minimization problem with a multilevel entropic regularization. The minimization problem is solved by introducing a multiscale extension of the celebrated Gibbs posterior distribution, which is shown to achieve the unique minimum. This yields a new training procedure for neural networks with performance guarantees, one that exploits the chain rule of relative entropy rather than the chain rule of derivatives (as in backpropagation) and that accounts for the interactions between the hypothesis sets of neural networks at different scales, corresponding to different depths of the hidden layers. To implement the procedure efficiently, we further develop a multilevel Metropolis algorithm that simulates the multiscale Gibbs distribution, and we report an experiment with a two-layer neural network on the MNIST data set.
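As background for the multilevel entropic regularization mentioned in the abstract, the classical single-level case can be stated as a worked equation. This is the standard Gibbs variational principle, not the paper's multilevel extension; the symbols \(\pi\) (prior), \(\hat{L}\) (empirical risk), \(\beta\) (inverse temperature), and \(D(\cdot\|\cdot)\) (relative entropy) are introduced here for illustration and do not come from this record.

\[
\rho^{*} \;=\; \arg\min_{\rho}\; \mathbb{E}_{h\sim\rho}\!\big[\hat{L}(h)\big] \;+\; \frac{1}{\beta}\, D(\rho \,\|\, \pi),
\qquad
\rho^{*}(\mathrm{d}h) \;=\; \frac{e^{-\beta \hat{L}(h)}\, \pi(\mathrm{d}h)}{\int e^{-\beta \hat{L}(h')}\, \pi(\mathrm{d}h')}.
\]

Roughly, in the paper's multilevel setting the single penalty \(D(\rho\|\pi)\) decomposes, via the chain rule of relative entropy, into a weighted sum of terms, one per level of the hierarchical covering; the minimizer is then the multiscale Gibbs distribution that the multilevel Metropolis algorithm simulates.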

Type
research article
Web of Science ID
WOS:000558791500001
Author(s)
Asadi, Amir R. • Abbe, Emmanuel
Date Issued
2020-01-01
Publisher
Microtome Publishing
Published in
Journal of Machine Learning Research
Volume
21
Subjects
Automation & Control Systems • Computer Science, Artificial Intelligence • Computer Science • neural networks • multilevel relative entropy • chaining mutual information • multiscale generalization bound • multiscale Gibbs distribution • distributions

Editorial or Peer reviewed
REVIEWED
Written at
EPFL
EPFL units
MDS1
Available on Infoscience
August 27, 2020
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/171148