Infoscience

doctoral thesis

Distributional Regression and Autoregression via Optimal Transport

Ghodrati, Laya  
2023

We present a framework for performing regression when both covariate and response are probability distributions on a compact and convex subset of $\mathbb{R}^d$. Our regression model is based on the theory of optimal transport and links the conditional Fréchet mean of the response to the covariate via an optimal transport map. We define a Fréchet least-squares estimator of this regression map, and establish its consistency and rate of convergence to the true map under full observation of the regression pairs.

For the specific case $d=1$, we obtain additional results: we establish the minimax rate of estimation of such a regression function by deriving a lower bound that matches the convergence rate attained by the Fréchet least-squares estimator. Additionally, we find an upper bound for the convergence rate of an estimator when only samples from the covariate and response distributions are observed. In this case, moreover, the computation of the estimator is shown to reduce to a standard convex optimisation problem, and thus our regression model can be implemented with ease. We illustrate our methodology using real and simulated data.
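To make the $d=1$ reduction concrete, here is a minimal sketch, not the thesis's implementation. It relies on the standard fact that, on the real line, the squared 2-Wasserstein distance is the squared $L^2$ distance between quantile functions, and that pushing a distribution through a nondecreasing map $T$ composes $T$ with its quantile function. Under that assumption, a least-squares fit over nondecreasing maps becomes an isotonic regression on pooled quantile pairs, which is one convex formulation (the thesis may use a different one). The true map `true_map`, the Beta-distributed covariates, the sample sizes and the grid of levels are all hypothetical choices made for illustration.

```python
import random

random.seed(0)  # fixed seed so the illustration is reproducible


def empirical_quantiles(sample, levels):
    """Empirical quantile function of `sample` at each level in (0, 1)."""
    xs = sorted(sample)
    n = len(xs)
    return [xs[min(int(u * n), n - 1)] for u in levels]


def isotonic_fit(x, y):
    """Least-squares nondecreasing fit of y on x (pool-adjacent-violators).

    Returns the sorted x values and the fitted values at those points.
    Ties in x are ordered by decreasing y so that they get pooled together.
    """
    order = sorted(range(len(x)), key=lambda i: (x[i], -y[i]))
    xs = [x[i] for i in order]
    ys = [y[i] for i in order]
    blocks = []  # each block stores [sum of y, number of points]
    for v in ys:
        blocks.append([v, 1])
        # Merge while the previous block mean exceeds the current block mean.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return xs, fitted


def true_map(x):
    """Hypothetical regression map, nondecreasing on [0, 1]."""
    return x ** 2


levels = [(j + 0.5) / 50 for j in range(50)]  # quantile levels in (0, 1)
pooled_x, pooled_y = [], []
for _ in range(30):  # 30 hypothetical (covariate, response) pairs
    a, b = random.uniform(1, 5), random.uniform(1, 5)
    # Only samples are observed: one from the covariate distribution ...
    mu_sample = [random.betavariate(a, b) for _ in range(200)]
    # ... and an independent one from the response (its pushforward).
    nu_sample = [true_map(random.betavariate(a, b)) for _ in range(200)]
    pooled_x += empirical_quantiles(mu_sample, levels)
    pooled_y += empirical_quantiles(nu_sample, levels)

xs, fitted = isotonic_fit(pooled_x, pooled_y)
mae = sum(abs(f - true_map(x)) for x, f in zip(xs, fitted)) / len(xs)
print(round(mae, 3))  # mean absolute error of the fitted map at the pooled points
```

The sketch omits the random perturbation of the response that a full regression model would include; adding one would change only how `nu_sample` is generated, not the fitting step.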

We explore the problem of defining and fitting models of autoregressive time series of probability distributions on a compact interval of $\mathbb{R}$. In this context, an order-$1$ autoregressive model is a Markov chain that specifies a certain structure (regression) for the one-step conditional Fréchet mean with respect to a natural probability metric. We construct and investigate different models based on iterated random function systems of optimal transport maps. While the properties and interpretation of these models depend on how they relate to the iterated transport system, they can all be analyzed theoretically in a unified way. We present such a theoretical analysis, including convergence rates, and illustrate our methodology using real and simulated data. Our models generalise or extend certain existing models of transportation-based regression and autoregression, and in doing so also provide some new insights on those previous models.
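As a rough illustration of what an iterated random function system of transport maps can look like, the sketch below simulates a Markov chain of distributions on $[0,1]$, each represented by its quantile function on a grid. It is an assumed toy model, not one of the thesis's models: each step shrinks the current quantile function towards that of the uniform distribution with a hypothetical weight `ALPHA`, then composes the result with a random nondecreasing map of $[0,1]$ onto itself whose pointwise expectation is the identity. Under these assumptions the one-step conditional Fréchet mean (for the 2-Wasserstein metric) has quantile function `ALPHA * Q_t(u) + (1 - ALPHA) * u`.

```python
import math
import random

random.seed(1)  # fixed seed so the illustration is reproducible

GRID = [j / 100 for j in range(101)]  # quantile levels u in [0, 1]
ALPHA = 0.7                           # hypothetical contraction weight


def random_map():
    """Random nondecreasing map of [0, 1] onto itself.

    The integer k is symmetric about zero, so the map averages to the
    identity at every point.
    """
    k = random.choice([-3, -2, -1, 1, 2, 3])
    return lambda x: x - math.sin(math.pi * k * x) / (abs(k) * math.pi)


def step(q):
    """One transition: contract towards the uniform law, then perturb."""
    t_eps = random_map()
    return [t_eps(ALPHA * qu + (1 - ALPHA) * u) for qu, u in zip(q, GRID)]


q = [u ** 2 for u in GRID]  # hypothetical initial quantile function
path = [q]
for _ in range(50):
    q = step(q)
    path.append(q)

# Each element of `path` is a quantile function evaluated on GRID.
print(len(path), round(path[-1][50], 3))
```

Because every map involved is nondecreasing and maps $[0,1]$ into itself, each state remains a valid quantile function, which is what lets the chain be read as a time series of distributions.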

Type
doctoral thesis
DOI
10.5075/epfl-thesis-9780
Author(s)
Ghodrati, Laya  
Advisors
Panaretos, Victor  
Jury

Prof. Maria Colombo (president); Prof. Victor Panaretos (thesis director); Dr Yoav Zemel, Prof. Elsa Cazelles, Prof. Alex Petersen (examiners)

Date Issued

2023

Publisher

EPFL

Publisher place

Lausanne

Public defense date

2023-09-19

Thesis number

9780

Total of pages

130

Subjects

  • Distributional Regression
  • Distributional Time Series
  • Optimal Transport
  • Wasserstein Metric

EPFL units
SMAT  
Faculty
SB  
School
MATHAA  
Doctoral School
EDMA  
Available on Infoscience
September 19, 2023
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/200773
Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved