An Initial Alignment between Neural Network and Target is Needed for Gradient Descent to Learn
This paper introduces the notion of "Initial Alignment" (INAL) between a neural network at initialization and a target function. It is proved that if a network and a Boolean target function do not have a noticeable INAL, then noisy gradient descent on a fully connected network with normalized i.i.d. initialization will not learn in polynomial time. Thus a certain amount of knowledge about the target (measured by the INAL) is needed in the architecture design. This also provides an answer to an open problem posed in (Abbe & Sandon, 2020a). The results are based on deriving lower bounds for descent algorithms on symmetric neural networks without explicit knowledge of the target function beyond its INAL.
WOS:000899944900003
2022-01-01
San Diego
Proceedings of Machine Learning Research
33
52
REVIEWED
EPFL
| Event name | Event place | Event date |
| | Baltimore, MD | Jul 17-23, 2022 |