Publication:

The inductive bias of deep learning: Connecting weights and functions

cris.lastimport.scopus

2024-08-07T10:46:53Z

cris.lastimport.wos

2024-07-29T05:34:16Z

cris.legacyId

306613

cris.virtual.author-scopus

7005865062

cris.virtual.department

LTS4

cris.virtual.parent-organization

IEM

cris.virtual.parent-organization

STI

cris.virtual.parent-organization

EPFL

cris.virtual.parent-organization

EDOC

cris.virtual.parent-organization

ETU

cris.virtual.sciperId

101475

cris.virtual.sciperId

299435

cris.virtual.unitId

10851

cris.virtual.unitManager

Frossard, Pascal

cris.virtualsource.author-scopus

e8dafcb2-ee80-46b8-9351-5242e0ee0245

cris.virtualsource.author-scopus

1b724cc5-d9f9-45a3-9db3-c5da4473ee89

cris.virtualsource.department

e8dafcb2-ee80-46b8-9351-5242e0ee0245

cris.virtualsource.department

1b724cc5-d9f9-45a3-9db3-c5da4473ee89

cris.virtualsource.orcid

e8dafcb2-ee80-46b8-9351-5242e0ee0245

cris.virtualsource.orcid

1b724cc5-d9f9-45a3-9db3-c5da4473ee89

cris.virtualsource.parent-organization

fed12497-58d6-4287-ab16-dcfef0a03016

cris.virtualsource.parent-organization

b90b43a5-ca1d-4299-9a3f-53ece71373f9

cris.virtualsource.parent-organization

e241245b-0e63-4d9e-806e-b766e62006ef

cris.virtualsource.parent-organization

5520d273-fb5f-458c-93a9-f6a9eee8961b

cris.virtualsource.rid

e8dafcb2-ee80-46b8-9351-5242e0ee0245

cris.virtualsource.rid

1b724cc5-d9f9-45a3-9db3-c5da4473ee89

cris.virtualsource.sciperId

e8dafcb2-ee80-46b8-9351-5242e0ee0245

cris.virtualsource.sciperId

1b724cc5-d9f9-45a3-9db3-c5da4473ee89

cris.virtualsource.unitId

fed12497-58d6-4287-ab16-dcfef0a03016

cris.virtualsource.unitManager

fed12497-58d6-4287-ab16-dcfef0a03016

datacite.rights

openaccess

dc.contributor.advisor

Frossard, Pascal

dc.contributor.author

Ortiz Jimenez, Guillermo

dc.date.accepted

2023

dc.date.accessioned

2023-11-22T08:48:21

dc.date.available

2023-11-22T08:48:21

dc.date.created

2023-11-22

dc.date.issued

2023

dc.date.modified

2025-05-28T07:52:10.016301Z

dc.description.abstract

Years of fierce competition have naturally selected the fittest deep learning algorithms. Yet, although these models work well in practice, we still lack a proper characterization of why they do so. This raises serious questions about the robustness, trustworthiness, and fairness of modern AI systems. This thesis aims to help bridge this gap by advancing the empirical and theoretical understanding of deep learning, with particular emphasis on the intricate relationship between weight space and function space and how it shapes the inductive bias.

Our investigation starts with the simplest possible learning scenario: learning linearly separable hypotheses. Despite the simplicity of this setting, our analysis reveals that most networks exhibit a nuanced inductive bias on these tasks that depends on the direction of separability. Specifically, we show that this bias can be encapsulated in an ordered sequence of vectors, the neural anisotropy directions (NADs), which encode the preference of a network to separate the training data along a given direction. The NADs can be obtained by randomly sampling the weight space. This not only establishes a strong connection between the functional landscape and the directional bias of each architecture but also offers a new lens for examining inductive biases in deep learning.
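
A minimal sketch of this idea, under illustrative assumptions: candidate NADs are taken as the principal directions of input gradients sampled over random initializations. The toy MLP, input dimension, and sample count below are placeholders, not the thesis' exact protocol.

```python
# Hypothetical sketch: estimating neural anisotropy directions (NADs)
# by sampling the weight space. Architecture and sizes are illustrative.
import torch
import torch.nn as nn

def make_mlp(d):
    return nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, 1))

def sample_input_gradients(d=32, n_inits=200):
    """Gradient of the scalar output w.r.t. the input, for many random
    initializations; directional bias shows up in how these gradients
    concentrate along certain input directions."""
    grads = []
    for _ in range(n_inits):
        model = make_mlp(d)                      # fresh random weights
        x = torch.randn(1, d, requires_grad=True)
        model(x).sum().backward()
        grads.append(x.grad.view(-1))
    return torch.stack(grads)                    # (n_inits, d)

def neural_anisotropy_directions(grads):
    """NADs as eigenvectors of the gradient second-moment matrix,
    ordered by decreasing eigenvalue."""
    cov = grads.T @ grads / grads.shape[0]
    eigvals, eigvecs = torch.linalg.eigh(cov)    # ascending eigenvalues
    order = torch.argsort(eigvals, descending=True)
    return eigvecs[:, order], eigvals[order]

nads, spectrum = neural_anisotropy_directions(sample_input_gradients())
```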

We then turn our attention to modelling the inductive bias towards a more general set of hypotheses. To do so, we explore the applicability of the neural tangent kernel (NTK) as an analytical tool to approximate the functional landscape. Our research shows that NTK approximations can indeed gauge the relative learning complexities of numerous tasks, even when they cannot predict absolute network performance. This approximation works best when the learned weights lie close to the initialization. This provides a nuanced understanding of the NTK's ability to capture inductive bias, laying the groundwork for its application in our subsequent investigations.
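
As a hedged illustration of the quantity involved (not the thesis' exact experiments), the empirical NTK of a network at its current weights is the inner product of parameter Jacobians, K(x, x') = <df/dtheta(x), df/dtheta(x')>. The toy model and inputs below are assumptions for the example.

```python
# Minimal sketch of an empirical NTK around initialization, used as a
# proxy for the functional landscape. Model and inputs are illustrative.
import torch
import torch.nn as nn
from torch.func import functional_call, jacrev

model = nn.Sequential(nn.Linear(8, 32), nn.Tanh(), nn.Linear(32, 1))
params = dict(model.named_parameters())

def f(p, x):
    # scalar network output as a function of the parameters
    return functional_call(model, p, (x.unsqueeze(0),)).squeeze()

def empirical_ntk(x1, x2):
    """K(x1, x2) = <df/dtheta(x1), df/dtheta(x2)> at the current weights."""
    j1 = jacrev(f)(params, x1)
    j2 = jacrev(f)(params, x2)
    return sum(j1[k].flatten() @ j2[k].flatten() for k in j1)

x_a, x_b = torch.randn(8), torch.randn(8)
print(empirical_ntk(x_a, x_b))
```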

The thesis then explores two critical issues in deep learning research. First, we scrutinize implicit neural representations (INRs) and their ability to encode rich multimedia signals. Drawing inspiration from harmonic analysis and our earlier findings, we show that the NTK's eigenfunctions act as dictionary atoms whose inner products with the target signal determine the final reconstruction performance. INRs that use sinusoidal embeddings to encode the input can modulate the NTK so that its eigenfunctions constitute a meaningful basis. This insight has the potential to accelerate the development of principled algorithms for INRs, offering new avenues for architectural improvements and design.
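
A minimal sketch of an INR with a sinusoidal input embedding, assuming random Fourier features and a toy 1D target signal; the frequency scale, widths, and training loop below are illustrative assumptions, not the thesis' exact setup.

```python
# Hypothetical sketch: an implicit neural representation whose sinusoidal
# embedding reshapes the NTK so its eigenfunctions suit the signal.
import torch
import torch.nn as nn

class FourierINR(nn.Module):
    def __init__(self, in_dim=1, n_freq=64, width=128, scale=10.0):
        super().__init__()
        # fixed random projection defining the embedding frequencies
        self.register_buffer("B", scale * torch.randn(n_freq, in_dim))
        self.net = nn.Sequential(
            nn.Linear(2 * n_freq, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
            nn.Linear(width, 1),
        )

    def forward(self, x):
        proj = 2 * torch.pi * x @ self.B.T
        feats = torch.cat([torch.sin(proj), torch.cos(proj)], dim=-1)
        return self.net(feats)

# Fit a toy 1D signal from its coordinates alone (placeholder target).
coords = torch.linspace(0, 1, 256).unsqueeze(-1)
signal = torch.sin(8 * torch.pi * coords)
inr = FourierINR()
opt = torch.optim.Adam(inr.parameters(), lr=1e-3)
for _ in range(500):
    opt.zero_grad()
    loss = ((inr(coords) - signal) ** 2).mean()
    loss.backward()
    opt.step()
```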

Second, we offer an extensive study of the conditions required for direct model editing in the weight space. Our analysis introduces the concept of weight disentanglement as the crucial factor enabling task-specific adjustments via task arithmetic. This property emerges during pre-training and is evident when distinct directions in weight space govern separate, localized regions of the input space. Significantly, we find that linearizing models by fine-tuning them in their tangent space enhances weight disentanglement, leading to performance improvements across editing benchmarks and models.
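
A minimal sketch of task arithmetic on parameter state dicts, assuming hypothetical pre-trained and fine-tuned checkpoints: a task vector is the difference between fine-tuned and pre-trained weights, and edits are linear combinations of task vectors added back to the pre-trained model.

```python
# Hypothetical sketch of task arithmetic: edits applied directly in
# weight space. The state-dict interface and coefficients are assumed.
import torch

def task_vector(pretrained, finetuned):
    """tau = theta_ft - theta_pre, computed per parameter tensor."""
    return {k: finetuned[k] - pretrained[k] for k in pretrained}

def apply_edit(pretrained, task_vectors, alphas):
    """theta = theta_pre + sum_i alpha_i * tau_i. Weight disentanglement
    is what lets each tau_i act on its own region of input space without
    interfering with the others."""
    edited = {k: v.clone() for k, v in pretrained.items()}
    for tau, alpha in zip(task_vectors, alphas):
        for k in edited:
            edited[k] += alpha * tau[k]
    return edited

# usage: add task A, negate (forget) task B
# edited = apply_edit(theta_pre, [tau_a, tau_b], alphas=[1.0, -1.0])
```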

In summary, our work unveils fresh insights into the fundamental links between weight space and function space, proposing a general framework for approximating inductive biases.

dc.description.sponsorship

LTS4

dc.identifier.doi

10.5075/epfl-thesis-9898

dc.identifier.uri

https://infoscience.epfl.ch/handle/20.500.14299/202342

dc.language.iso

en

dc.publisher

EPFL

dc.publisher.place

Lausanne

dc.relation

https://infoscience.epfl.ch/record/306613/files/EPFL_TH9898.pdf

dc.size

188

dc.subject

Deep learning science

dc.subject

inductive bias

dc.subject

generalization

dc.subject

neural anisotropy directions

dc.subject

neural tangent kernel

dc.subject

implicit neural representations

dc.subject

model editing

dc.subject

task arithmetic

dc.subject

weight interpolations

dc.title

The inductive bias of deep learning: Connecting weights and functions

dc.type

thesis::doctoral thesis

dspace.entity.type

Publication

dspace.file.type

n/a

dspace.legacy.oai-identifier

oai:infoscience.epfl.ch:306613

epfl.legacy.itemtype

Theses

epfl.legacy.submissionform

THESIS

epfl.oai.currentset

fulltext

epfl.oai.currentset

DOI

epfl.oai.currentset

STI

epfl.oai.currentset

thesis

epfl.oai.currentset

thesis-bn

epfl.oai.currentset

OpenAIREv4

epfl.publication.version

http://purl.org/coar/version/c_970fb48d4fbd8a85

epfl.thesis.doctoralSchool

EDEE

epfl.thesis.faculty

STI

epfl.thesis.institute

IEL

epfl.thesis.jury

Prof. Alexandre Massoud Alahi (president); Prof. Pascal Frossard (thesis director); Prof. Matthieu Wyart, Prof. Fanny Yang, Dr Wieland Brendel (examiners)

epfl.thesis.number

9898

epfl.thesis.originalUnit

LTS4

epfl.thesis.publicDefenseYear

2023-12-05

epfl.writtenAt

EPFL

oaire.licenseCondition

copyright

Files

Original bundle

Name: EPFL_TH9898.pdf
Size: 14.93 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed to upon submission