conference paper

Geometry of the Loss Landscape in Overparameterized Neural Networks: Symmetries and Invariances

Şimşek, Berfin
•
Ged, François
•
Jacot, Arthur
2021
Proceedings of the 38th International Conference on Machine Learning
38th International Conference on Machine Learning (ICML 2021)

We study how permutation symmetries in overparameterized multi-layer neural networks generate 'symmetry-induced' critical points. Assuming a network with $L$ layers of minimal widths $r_1^*, \ldots, r_{L-1}^*$ reaches a zero-loss minimum at $r_1^*! \cdots r_{L-1}^*!$ isolated points that are permutations of one another, we show that adding one extra neuron to each layer is sufficient to connect all these previously discrete minima into a single manifold. For a two-layer overparameterized network of width $r^* + h =: m$ we explicitly describe the manifold of global minima: it consists of $T(r^*, m)$ affine subspaces of dimension at least $h$ that are connected to one another. For a network of width $m$, we identify the number $G(r, m)$ of affine subspaces containing only symmetry-induced critical points that are related to the critical points of a smaller network of width $r < r^*$. Via a combinatorial analysis, we derive closed-form formulas for $T$ and $G$ and show that the number of symmetry-induced critical subspaces dominates the number of affine subspaces forming the global minima manifold in the mildly overparameterized regime (small $h$) and vice versa in the vastly overparameterized regime ($h \gg r^*$). Our results provide new insights into the minimization of the non-convex loss function of overparameterized neural networks.
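The permutation symmetry behind the count $r_1^*! \cdots r_{L-1}^*!$ can be illustrated with a minimal sketch. It is not taken from the paper: the tanh activation, the hidden width 3, the input dimension 2, and all function names are assumptions chosen for illustration. It checks that reordering the hidden neurons of a two-layer network leaves the output unchanged, so each parameter point with distinct neurons has $r!$ permuted copies computing the same function. It does not implement the closed-form formulas for $T$ and $G$.

```python
import itertools
import math
import random


def count_permuted_minima(widths):
    """Product of r! over the hidden-layer widths (hypothetical helper)."""
    return math.prod(math.factorial(r) for r in widths)


def two_layer_net(x, W, a):
    """f(x) = sum_i a[i] * tanh(W[i] . x) for one input vector x."""
    return sum(
        a_i * math.tanh(sum(w * x_j for w, x_j in zip(w_i, x)))
        for w_i, a_i in zip(W, a)
    )


def permute_hidden(W, a, perm):
    """Reorder hidden neurons; incoming and outgoing weights move together."""
    return [W[i] for i in perm], [a[i] for i in perm]


rng = random.Random(0)
r, d = 3, 2  # assumed hidden width and input dimension
W = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(r)]
a = [rng.gauss(0, 1) for _ in range(r)]
x = [0.5, -1.0]

# Evaluate the network under every reordering of its hidden neurons.
outputs = [
    two_layer_net(x, *permute_hidden(W, a, p))
    for p in itertools.permutations(range(r))
]
n_copies = len(outputs)  # r! reorderings
max_dev = max(abs(y - outputs[0]) for y in outputs)  # rounding error only

print(n_copies, count_permuted_minima([r]), max_dev)
```

Under these assumptions, the outputs of all reorderings agree up to floating-point rounding, which is the invariance that makes the isolated minima of a minimal-width network come in permuted families.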

Type
conference paper
Web of Science ID

WOS:000768182705080

ArXiv ID

2105.12221v2

Author(s)
Şimşek, Berfin
•
Ged, François
•
Jacot, Arthur
•
Spadaro, Francesco
•
Hongler, Clément
•
Gerstner, Wulfram  
•
Brea, Johanni  
Date Issued

2021

Published in
Proceedings of the 38th International Conference on Machine Learning
Total of pages

29

Series title/Series vol.

Proceedings of Machine Learning Research; 139

Volume

139

Start page

9722

End page

9732

Subjects

ml-ai

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
LCN  
Event name

38th International Conference on Machine Learning (ICML 2021)

Event place

Virtual

Event date

July 18-24, 2021

Available on Infoscience
January 17, 2022
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/184607