Infoscience

research article

A jamming transition from under- to over-parametrization affects generalization in deep learning

Spigler, S.
•
Geiger, M.
•
d'Ascoli, S.
November 22, 2019
Journal Of Physics A-Mathematical And Theoretical

In this paper we first recall the recent result that in deep networks a phase transition, analogous to the jamming transition of granular media, delimits the over- and under-parametrized regimes where fitting can or cannot be achieved. The analysis leading to this result supports that, for proper initialization and architectures, in the whole over-parametrized regime poor minima of the loss are not encountered during training, because the number of constraints that hinders the dynamics is insufficient to allow for the emergence of stable minima. Next, we study systematically how this transition affects the generalization properties of the network (i.e. its predictive power). As we increase the number of parameters of a given model, starting from an under-parametrized network, we observe for gradient descent that the generalization error displays three phases: (i) initial decay, (ii) increase until the transition point, where it displays a cusp, and (iii) slow decay toward an asymptote as the network width diverges. However, if early stopping is used, the cusp signaling the jamming transition disappears. Thereby we identify the region where the classical phenomenon of over-fitting takes place as the vicinity of the jamming transition, and the region where the model keeps improving as the number of parameters increases, thus organizing previous empirical observations made in modern neural networks.
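
The distinction between the regime where fitting cannot be achieved and the regime where it can may be illustrated with a much simpler model than the deep networks studied in the article. The sketch below is not the authors' method: it swaps in a minimum-norm least-squares fit on random ReLU features, and the function name `random_feature_errors`, the noisy linear teacher, and all default values are assumptions chosen for illustration. It only shows that the training error is strictly positive when there are fewer fitted parameters than training points and drops to numerical zero once there are more. In toy models of this kind the test error is often reported to peak near that threshold, but the sketch makes no claim about the cusp described in the abstract.

```python
import numpy as np


def random_feature_errors(width, n_train=40, n_test=500, dim=10, noise=0.5, seed=0):
    """Return (train_mse, test_mse) for a min-norm least-squares fit
    on `width` random ReLU features. All defaults are illustrative."""
    rng = np.random.default_rng(seed)

    # Toy teacher: a noisy linear target (an assumption, not the paper's data).
    # The data are drawn before the features, so they are identical for every width.
    w_true = rng.standard_normal(dim)
    x_train = rng.standard_normal((n_train, dim))
    x_test = rng.standard_normal((n_test, dim))
    y_train = x_train @ w_true + noise * rng.standard_normal(n_train)
    y_test = x_test @ w_true

    # Fixed random first layer; only the linear readout (`width` parameters) is fitted.
    proj = rng.standard_normal((dim, width)) / np.sqrt(dim)
    f_train = np.maximum(x_train @ proj, 0.0)
    f_test = np.maximum(x_test @ proj, 0.0)

    # np.linalg.lstsq returns the minimum-norm solution when the
    # system is under-determined (width > n_train).
    coef, *_ = np.linalg.lstsq(f_train, y_train, rcond=None)

    train_mse = float(np.mean((f_train @ coef - y_train) ** 2))
    test_mse = float(np.mean((f_test @ coef - y_test) ** 2))
    return train_mse, test_mse


if __name__ == "__main__":
    # Sweep the number of fitted parameters across the number of training points (40).
    for width in (5, 10, 20, 40, 80, 160, 640):
        train_mse, test_mse = random_feature_errors(width)
        print(f"width={width:4d}  train_mse={train_mse:.3e}  test_mse={test_mse:.3e}")
```

Running the script prints one line per width. Comparing the `train_mse` column on either side of `width=40` shows the change from a regime where the data cannot be fitted to one where they can.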

Type
research article
DOI
10.1088/1751-8121/ab4c8b
Web of Science ID

WOS:000493107800002

Author(s)
Spigler, S.
Geiger, M.
d'Ascoli, S.
Sagun, L.
Biroli, G.
Wyart, M.  
Date Issued

2019-11-22

Publisher

IOP PUBLISHING LTD

Published in
Journal Of Physics A-Mathematical And Theoretical
Volume

52

Issue

47

Article Number

474001

Subjects
  • Physics, Multidisciplinary
  • Physics, Mathematical
  • Physics
  • jamming
  • overparametrization
  • generalization
  • neural networks

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
PCSL  
Available on Infoscience
November 14, 2019
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/163107

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved