 
doctoral thesis

Topics in statistical physics of high-dimensional machine learning

Cui, Hugo Chao  
2024

In the past few years, Machine Learning (ML) techniques have ushered in a paradigm shift, allowing ever more abundant sources of data to be harnessed to automate complex tasks. The technical workhorse behind these breakthroughs arguably lies in the use of artificial neural networks to learn informative and actionable representations of data, from data. Yet while empirical successes continue to accrue, a solid theoretical understanding of the unreasonable effectiveness of ML methods in learning from high-dimensional data remains largely elusive. This is the question addressed in this thesis, through the study of solvable high-dimensional models satisfying the dual requirement of (a) capturing the key features of practical ML tasks while (b) remaining amenable to mathematical analysis. Borrowing ideas from statistical physics, the thesis presents sharp asymptotic incursions into a selection of central aspects of modern ML.

The remarkable versatility of ML models lies in their ability to extract informative features from data. The first part of the thesis analyzes which structural characteristics of these features condition what ML methods can learn. Specifically, it highlights how, in several settings, a theory formulated in terms of only two statistical descriptors tightly captures the learning curves of simple real-world tasks. For kernel methods in particular, this insight makes it possible to relate error scaling laws to the structure of the features, as the numerical sketch below illustrates.
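
As a purely numerical illustration of this last point (the thesis derives such scaling laws analytically; all parameters below are hypothetical), the following sketch runs ridge regression in a kernel eigenbasis with power-law eigenvalue decay and fits the power-law exponent of the resulting learning curve:

    import numpy as np

    rng = np.random.default_rng(0)
    d = 2000                               # truncated kernel eigenbasis (hypothetical size)
    alpha, a = 1.5, 1.2                    # hypothetical spectral / target decay exponents
    lam = np.arange(1.0, d + 1) ** -alpha  # power-law kernel eigenvalue decay
    theta = np.arange(1.0, d + 1) ** -a    # target coefficients in the eigenbasis

    def excess_risk(n, ridge=1e-6):
        # Features x_k ~ N(0, lam_k): ridge regression in the eigenbasis,
        # equivalent to kernel ridge regression with this spectrum.
        X = rng.standard_normal((n, d)) * np.sqrt(lam)
        w = np.linalg.solve(X.T @ X + ridge * np.eye(d), X.T @ (X @ theta))
        # Population excess risk: sum_k lam_k (w_k - theta_k)^2
        return np.sum(lam * (w - theta) ** 2)

    ns = [100, 200, 400, 800, 1600]
    errs = [np.mean([excess_risk(n) for _ in range(5)]) for n in ns]
    slope = np.polyfit(np.log(ns), np.log(errs), 1)[0]
    print(f"learning curve decays roughly as n^{slope:.2f}")

The fitted exponent depends on the two decay parameters alpha and a, which is the sense in which the error scaling law reflects the structure of the features.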

The second part narrows the focus to the question of which features are extracted by multi-layer neural networks, both (a) when untrained and (b) when trained, either in the Bayesian framework or after a single large gradient step. In particular, it delineates the cases in which Gaussian universality holds and limits the network's expressivity, and the cases in which neural networks succeed in learning non-trivial features.
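
To make the notion of Gaussian universality concrete, here is a minimal sketch, under assumed dimensions and a ReLU activation, of the Gaussian-equivalence heuristic for random features: the nonlinear features relu(Wx) are swapped for a linear-plus-noise surrogate with matched low-order moments, and ridge regression attains a nearly identical test error:

    import numpy as np

    rng = np.random.default_rng(0)
    d, p, n, n_test = 200, 400, 600, 2000        # hypothetical dimensions
    relu = lambda z: np.maximum(z, 0.0)

    # Gaussian-equivalence coefficients of the activation (Monte-Carlo moments)
    g = rng.standard_normal(1_000_000)
    mu0 = relu(g).mean()                          # E[relu(g)]
    mu1 = (g * relu(g)).mean()                    # E[g relu(g)]
    mu_star = np.sqrt(relu(g).var() - mu1 ** 2)   # residual nonlinear strength

    W = rng.standard_normal((p, d)) / np.sqrt(d)
    theta = rng.standard_normal(d) / np.sqrt(d)   # linear teacher
    X, Xt = rng.standard_normal((n, d)), rng.standard_normal((n_test, d))
    y, yt = X @ theta, Xt @ theta

    def ridge_test_error(F, Ft, ridge=1e-2):
        w = np.linalg.solve(F.T @ F + ridge * np.eye(p), F.T @ y)
        return np.mean((Ft @ w - yt) ** 2)

    # (a) genuine ReLU random features vs (b) Gaussian-equivalent surrogate
    err_rf = ridge_test_error(relu(X @ W.T), relu(Xt @ W.T))
    Geq = lambda Z, m: mu0 + mu1 * (Z @ W.T) + mu_star * rng.standard_normal((m, p))
    err_eq = ridge_test_error(Geq(X, n), Geq(Xt, n_test))
    print(err_rf, err_eq)   # close, as Gaussian equivalence predicts in high dimension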

Finally, supervised learning with fully-connected architectures constitutes but a small part of the zoology of modern ML tasks. The last part of the thesis extends these sharp asymptotic explorations to more modern aspects of the discipline, in particular transport-based generative models and dot-product attention mechanisms.
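
For reference, dot-product attention takes the standard scaled-softmax form; a self-contained sketch (dimensions hypothetical):

    import numpy as np

    def dot_product_attention(Q, K, V):
        # Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V
        scores = Q @ K.T / np.sqrt(K.shape[-1])
        scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
        weights = np.exp(scores)
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ V

    rng = np.random.default_rng(0)
    L, dk = 5, 8                         # sequence length, head dimension (hypothetical)
    X = rng.standard_normal((L, dk))
    Wq, Wk, Wv = (rng.standard_normal((dk, dk)) for _ in range(3))
    out = dot_product_attention(X @ Wq, X @ Wk, X @ Wv)
    print(out.shape)                     # (5, 8): one attention-mixed vector per token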

Type
doctoral thesis
DOI
10.5075/epfl-thesis-10948
Author(s)
Cui, Hugo Chao  
Advisors
Zdeborová, Lenka  
Jury
Prof. Laurent Villard (president); Prof. Lenka Zdeborová (thesis director); Prof. Nicolas Flammarion, Prof. Giulio Biroli, Prof. Joan Bruna (examiners)
Date Issued
2024
Publisher
EPFL
Publisher place
Lausanne
Public defense date
2024-06-24
Thesis number
10948
Number of pages
284
Subjects
Machine Learning • Statistical Physics • High-dimensional asymptotics • Deep Neural Networks • Random Features • Gaussian Universality • Kernels • Attention mechanisms • Generative models
EPFL units
SPOC1  
Faculty
SB  
School
IPHYS  
Doctoral School
EDPY  
Available on Infoscience
June 19, 2024
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/208803