Infoscience

research article

Optimal errors and phase transitions in high-dimensional generalized linear models

Barbier, Jean • Krzakala, Florent • Macris, Nicolas • Miolane, Leo • Zdeborova, Lenka
March 19, 2019
Proceedings Of The National Academy Of Sciences Of The United States Of America (PNAS)

Generalized linear models (GLMs) are used in high-dimensional machine learning, statistics, communications, and signal processing. In this paper we analyze GLMs when the data matrix is random, as relevant in problems such as compressed sensing, error-correcting codes, or benchmark models in neural networks. We evaluate the mutual information (or "free entropy") from which we deduce the Bayes-optimal estimation and generalization errors. Our analysis applies to the high-dimensional limit where both the number of samples and the dimension are large and their ratio is fixed. Nonrigorous predictions for the optimal errors existed for special cases of GLMs, e.g., for the perceptron, in the field of statistical physics based on the so-called replica method. Our present paper rigorously establishes those decades-old conjectures and brings forward their algorithmic interpretation in terms of performance of the generalized approximate message-passing algorithm. Furthermore, we tightly characterize, for many learning problems, regions of parameters for which this algorithm achieves the optimal performance and locate the associated sharp phase transitions separating learnable and nonlearnable regions. We believe that this random version of GLMs can serve as a challenging benchmark for multipurpose algorithms.
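The setting the abstract describes can be illustrated concretely. Below is a minimal sketch (not the authors' code; the dimension, ratio, and channel choice are illustrative assumptions) of a random-design GLM instance in the high-dimensional regime: n samples and dimension d both large, with the ratio alpha = n/d held fixed, an i.i.d. Gaussian data matrix, and a perceptron-style sign output channel as one example of a GLM channel.

```python
import numpy as np

# Minimal sketch of the random-design GLM setting: n samples, dimension d,
# fixed sample ratio alpha = n / d, i.i.d. Gaussian data matrix.
# All concrete values below are illustrative, not from the paper.
rng = np.random.default_rng(0)

d = 200                     # signal dimension (illustrative)
alpha = 2.0                 # sample ratio n / d (illustrative)
n = int(alpha * d)

X = rng.standard_normal((n, d)) / np.sqrt(d)  # random data matrix
w = rng.standard_normal(d)                    # ground-truth signal

# Perceptron-style output channel: y = sign(X w).
# Other channels (e.g. a noisy linear channel, as in compressed sensing)
# fit the same template with a different output function.
y = np.sign(X @ w)

print(X.shape, y.shape)
```

The Bayes-optimal question studied in the paper is then how well w can be recovered, or the channel's outputs predicted, from (X, y) alone as d grows with alpha fixed.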

Type
research article
DOI
10.1073/pnas.1802705116
Web of Science ID
WOS:000461679000043

Author(s)
Barbier, Jean  
Krzakala, Florent
Macris, Nicolas  
Miolane, Leo
Zdeborova, Lenka
Date Issued
2019-03-19
Publisher
National Academy of Sciences
Published in
Proceedings Of The National Academy Of Sciences Of The United States Of America (PNAS)
Volume
116
Issue
12
Start page
5451
End page
5460

Subjects
Multidisciplinary Sciences • Science & Technology - Other Topics • high-dimensional inference • generalized linear model • Bayesian inference • perceptron • approximate message-passing algorithm • message-passing algorithms • statistical mechanics • storage capacity • signal recovery • neural networks • information • retrieval • equations • CDMA

Note
This is an open access article under the terms of the Creative Commons Attribution License.
Editorial or Peer reviewed
REVIEWED

Written at
EPFL
EPFL units
LTHC
Available on Infoscience
June 18, 2019
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/158044
  • Contact
  • infoscience@epfl.ch

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved