On the symmetries in the dynamics of wide two-layer neural networks

Hajjar, Karl; Chizat, Lenaic

doi:10.3934/era.2023112

research article

January 1, 2023

Electronic Research Archive

We consider the idealized setting of gradient flow on the population risk for infinitely wide two-layer ReLU neural networks (without bias), and study the effect of symmetries on the learned parameters and predictors. We first describe a general class of symmetries which, when satisfied by the target function f* and the input distribution, are preserved by the dynamics. We then study more specific cases. When f* is odd, we show that the dynamics of the predictor reduces to that of a (non-linearly parameterized) linear predictor, and its exponential convergence can be guaranteed. When f* has a low-dimensional structure, we prove that the gradient flow PDE reduces to a lower-dimensional PDE. Furthermore, we present informal and numerical arguments that suggest that the input neurons align with the lower-dimensional structure of the problem.
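As a rough illustration of why odd symmetry and linear predictors are connected for ReLU networks without bias, the sketch below checks a standard identity numerically: since relu(z) - relu(-z) = z, the odd part of any finite-width two-layer ReLU network without bias is a linear function of the input. This is not the authors' argument or code. The finite width m, the 1/m normalization, the Gaussian parameters, and all function names are assumptions made only for this example.

```python
import random

def relu(z):
    return max(z, 0.0)

def two_layer_relu(x, weights, outputs):
    """f(x) = (1/m) * sum_j a_j * relu(<w_j, x>), with no bias terms.

    The 1/m scaling is an assumption made for this illustration.
    """
    m = len(weights)
    total = 0.0
    for w, a in zip(weights, outputs):
        total += a * relu(sum(wi * xi for wi, xi in zip(w, x)))
    return total / m

def odd_part(x, weights, outputs):
    """(f(x) - f(-x)) / 2 for the network above."""
    neg_x = [-xi for xi in x]
    f_pos = two_layer_relu(x, weights, outputs)
    f_neg = two_layer_relu(neg_x, weights, outputs)
    return 0.5 * (f_pos - f_neg)

def linear_form(x, weights, outputs):
    """<beta, x> with beta = (1/(2m)) * sum_j a_j * w_j."""
    m = len(weights)
    d = len(x)
    beta = [sum(a * w[i] for w, a in zip(weights, outputs)) / (2 * m)
            for i in range(d)]
    return sum(b * xi for b, xi in zip(beta, x))

# Hypothetical random network and input, chosen only for the check.
rng = random.Random(0)
d, m = 3, 50
weights = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(m)]
outputs = [rng.gauss(0, 1) for _ in range(m)]
x = [rng.gauss(0, 1) for _ in range(d)]

# The two quantities agree up to floating-point rounding.
gap = abs(odd_part(x, weights, outputs) - linear_form(x, weights, outputs))
print(gap < 1e-12)
```

The coefficient vector beta depends non-linearly on the parameters (a_j, w_j), which is one way to read the phrase "non-linearly parameterized linear predictor"; the dynamics and convergence result themselves are established in the article, not by this check.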

Type

research article

DOI

10.3934/era.2023112

Web of Science ID

WOS:000936495300001

Authors

Hajjar, Karl; Chizat, Lenaic

Publication date

2023-01-01

Publisher

AMER INST MATHEMATICAL SCIENCES-AIMS

Published in

Electronic Research Archive

Volume

31

Issue

4

Start page

2175

End page

2212

Subjects

Mathematics

neural networks

gradient descent

infinite-width limit...

representation learni...

Peer reviewed

REVIEWED

EPFL units

DOLA

Available on Infoscience

March 27, 2023

Use this identifier to reference this record

https://infoscience.epfl.ch/handle/20.500.14299/196480