Research article

Infinite-width limit of deep linear neural networks

Chizat, Lenaic • Colombo, Maria • Fernandez-Real, Xavier
May 6, 2024
Communications on Pure and Applied Mathematics

This paper studies the infinite-width limit of deep linear neural networks (NNs) initialized with random parameters. We show that, as the number of parameters diverges, the training dynamics converge (in a precise sense) to the dynamics obtained from gradient descent on an infinitely wide deterministic linear NN. Moreover, even though the weights remain random, we obtain their precise law along the training dynamics and prove a quantitative convergence result for the linear predictor in terms of the number of parameters. Finally, we study the continuous-time limit obtained for infinitely wide linear NNs and show that the linear predictors of the NN converge at an exponential rate to the minimal $\ell_2$-norm minimizer of the risk.
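As a rough numerical illustration of the abstract's final claim (not the paper's construction), the NumPy sketch below trains deep linear networks of increasing width by gradient descent from a small random initialization on an underdetermined regression problem, then compares the learned end-to-end predictor with the minimal $\ell_2$-norm solution. The depth, widths, step size, and initialization scale are illustrative assumptions, not values taken from the paper.

import numpy as np

rng = np.random.default_rng(0)

# Underdetermined regression (n < d): the risk has many minimizers,
# so the minimal l2-norm one is a distinguished target.
n, d = 5, 10
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)
beta_star = X.T @ np.linalg.solve(X @ X.T, y)  # minimal l2-norm interpolant

def end_to_end(Ws):
    """Collapse the layers into the linear map P = W_{L-1} ... W_0."""
    P = Ws[0]
    for W in Ws[1:]:
        P = W @ P
    return P  # shape (1, d)

def train(width, depth=3, steps=8000, lr=0.05, scale=0.3):
    """Gradient descent on the squared-loss risk of a deep linear NN."""
    dims = [d] + [width] * (depth - 1) + [1]
    # Small initialization (illustrative choice); it keeps the initial
    # predictor near zero so the bias toward the min-norm solution shows.
    Ws = [scale / np.sqrt(dims[i]) * rng.standard_normal((dims[i + 1], dims[i]))
          for i in range(depth)]
    for _ in range(steps):
        r = X @ end_to_end(Ws).ravel() - y   # residuals, shape (n,)
        G = (X.T @ r / n)[None, :]           # d(risk)/dP, shape (1, d)
        grads = []
        for i in range(depth):
            # d(risk)/dW_i = (W_{L-1}..W_{i+1})^T G (W_{i-1}..W_0)^T
            left = np.eye(dims[-1])
            for W in reversed(Ws[i + 1:]):
                left = left @ W
            right = np.eye(dims[0])
            for W in Ws[:i]:
                right = W @ right
            grads.append(left.T @ G @ right.T)
        for W, g in zip(Ws, grads):          # simultaneous update of all layers
            W -= lr * g
    return end_to_end(Ws).ravel()

for width in (8, 32, 128):
    p = train(width)
    print(f"width {width:3d}: ||p - beta_star|| = {np.linalg.norm(p - beta_star):.3f}")

Under these assumed settings, the printed distances should be small relative to the norm of beta_star for every width, consistent with the convergence to the minimal $\ell_2$-norm minimizer stated in the abstract; the sketch does not reproduce the paper's quantitative rates.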

Files
Name: Comm Pure Appl Math - 2024 - Chizat - Infinite‐width limit of deep linear neural networks.pdf
Type: Publisher
Version: Published version
Access type: openaccess
License Condition: CC BY-NC-ND
Size: 1.34 MB
Format: Adobe PDF
Checksum (MD5): 68fb552cb37f136608480061db0d50cc

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved.