Abstract

Controlling the norm of the parameters often yields good generalisation when training neural networks. Beyond this simple intuition, however, the relation between the parameters' norm and the resulting estimators remains poorly understood theoretically. For networks with a single hidden ReLU layer and one-dimensional data, this work shows that the minimal parameters' norm required to represent a function is given by the total variation of its second derivative, weighted by a $\sqrt{1+x^2}$ factor. By comparison, this $\sqrt{1+x^2}$ weighting disappears when the norm of the bias terms is ignored. The additional weighting is of crucial importance: it is shown here to enforce uniqueness and sparsity (in the number of kinks) of the minimal-norm interpolator, whereas omitting the biases' norm allows for non-sparse solutions. Penalising the bias terms in the regularisation, whether explicitly or implicitly, thus leads to sparse estimators. This sparsity may contribute to the good generalisation of neural networks that is observed empirically.
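To make the stated result concrete, the display below sketches the claimed identity in LaTeX. The parameterisation (output weights $a_j$, hidden weights $w_j$, biases $b_j$), the proportionality constant, and the omission of boundary and affine terms are assumptions of this sketch, not the paper's exact formulation.

% Sketch only: assumed one-hidden-layer ReLU parameterisation
% f_\theta(x) = \sum_{j=1}^m a_j\,(w_j x + b_j)_+ (affine/skip terms omitted here).
% The minimal squared parameters' norm needed to represent f is, up to the paper's
% normalisation and boundary terms,
\[
  \min_{\theta :\, f_\theta = f}\ \sum_{j=1}^m \bigl(a_j^2 + w_j^2 + b_j^2\bigr)
  \;\propto\; \int_{\mathbb{R}} \sqrt{1+x^2}\,\mathrm{d}\lvert f'' \rvert(x),
\]
% whereas dropping the $b_j^2$ terms from the penalty removes the $\sqrt{1+x^2}$ weight
% and leaves the unweighted total variation of $f''$.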
