Infoscience
Doctoral thesis, EPFL

Parsimonious Online Learning with Kernels and Random Features with Applications to Stochastic Optimal Control

Le Moal, Quentin  
2025

This thesis studies stochastic optimal control through three interconnected contributions: a theoretical framework and two regression methods, POLK and POLRF, tailored to the high-dimensional regression problems that arise in this domain.

In the first part, we propose a general framework for solving stochastic optimal control problems using dynamic programming. A key contribution is the decomposition of total error at each time step into local approximation errors and propagated errors from future steps. This decomposition provides new theoretical insights into error backpropagation, a relatively underexplored area in stochastic control. While this framework establishes a solid foundation, its applicability is limited to problems amenable to dynamic programming.
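The decomposition can be sketched schematically as follows (the notation is ours, not the thesis's): writing $V_t$ for the true value function at step $t$ and $\hat{V}_t$ for its computed approximation, the total error at step $t$ splits into a local term and a term inherited from the step that follows,

```latex
\| V_t - \hat{V}_t \|
  \;\le\;
  \underbrace{\varepsilon_t}_{\text{local approximation error}}
  \;+\;
  \underbrace{L_t \, \| V_{t+1} - \hat{V}_{t+1} \|}_{\text{error propagated from step } t+1},
```

where $\varepsilon_t$ is the regression error committed at step $t$ and $L_t$ is a problem-dependent stability constant; unrolling the recursion backwards from the terminal step bounds the total error by a weighted sum of the local errors.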

The second part introduces POLK (Parsimonious Online Learning with Kernels), a kernel-based regression method enhanced with sparsification to control model complexity. POLK achieves efficient evaluation times and performs well on intricate regression tasks with localised variations. We further improve its empirical convergence speed by incorporating a second gradient term while maintaining theoretical guarantees. However, the computational cost of the sparsification process restricts its scalability to larger datasets, limiting its practical use in high-dimensional settings.
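The flavour of the method can be conveyed with a minimal sketch: functional stochastic gradient descent in an RKHS, where each sample adds a kernel centre and a crude pruning pass keeps the dictionary small. This is a hypothetical simplification for illustration only; the class name, the residual-based update, and the weight-threshold pruning are our assumptions and stand in for the thesis's actual sparsification procedure.

```python
import numpy as np

def gaussian_kernel(x, y, bw=1.0):
    # RBF kernel between two points
    d = x - y
    return np.exp(-np.dot(d, d) / (2 * bw**2))

class POLKSketch:
    """Illustrative sketch of parsimonious online kernel regression:
    functional SGD on squared loss, followed by greedy pruning of
    kernel centres whose weights are negligible. (Hypothetical
    simplification; details differ from the thesis's POLK.)"""

    def __init__(self, bw=0.5, lr=0.5, prune_tol=1e-3):
        self.bw, self.lr, self.prune_tol = bw, lr, prune_tol
        self.centers, self.weights = [], []

    def predict(self, x):
        return sum(w * gaussian_kernel(x, c, self.bw)
                   for c, w in zip(self.centers, self.weights))

    def update(self, x, y):
        # Functional SGD step: add a new kernel centre at x with
        # weight proportional to the current residual.
        err = y - self.predict(x)
        self.centers.append(x)
        self.weights.append(self.lr * err)
        self._prune()

    def _prune(self):
        # Drop centres whose weight is tiny: a crude stand-in for
        # the sparsification that controls model complexity.
        kept = [(c, w) for c, w in zip(self.centers, self.weights)
                if abs(w) > self.prune_tol]
        self.centers = [c for c, _ in kept]
        self.weights = [w for _, w in kept]
```

The pruning pass is what keeps evaluation cheap, but — as the paragraph above notes — scanning and rebuilding the dictionary on every update is exactly the kind of cost that limits scalability on large datasets.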

In the third part, we develop POLRF (Parsimonious Online Learning with Random Features), a scalable alternative to kernel-based methods. POLRF replaces kernels with random features, dynamically adapting its feature map distribution through sparsification. This adaptation resembles kernel learning, allowing POLRF to construct problem-specific representations. We establish a theoretical framework for POLRF, including convergence guarantees in a trajectory-dependent reproducing kernel Hilbert space (RKHS). Empirical results on the max-call option pricing problem demonstrate POLRF's competitive performance, including faster convergence and robustness in high dimensions. Despite its promise, open questions remain regarding the properties of the constructed RKHS and the optimisation of random feature distributions for specific problems.
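The random-feature side of the story can likewise be sketched in a few lines: approximate a Gaussian kernel with random Fourier features and fit the feature weights online by SGD. This is a static-feature stand-in under our own assumptions (feature count, bandwidth, learning rate, and the target function are all illustrative); it deliberately omits POLRF's defining ingredient, the sparsification-driven adaptation of the feature distribution.

```python
import numpy as np

def random_fourier_features(X, W, b):
    # Random Fourier feature map approximating a Gaussian kernel:
    # phi(x) = sqrt(2/D) * cos(W x + b)
    D = W.shape[0]
    return np.sqrt(2.0 / D) * np.cos(X @ W.T + b)

rng = np.random.default_rng(0)
d, D = 2, 200            # input dimension, number of random features
bw = 1.0                 # kernel bandwidth (assumed)
W = rng.normal(0.0, 1.0 / bw, size=(D, d))   # random frequencies
b = rng.uniform(0.0, 2 * np.pi, size=D)      # random phase offsets

# Online least squares on the fixed features via plain SGD, over a
# stream of samples from a smooth illustrative target.
theta = np.zeros(D)
lr = 0.5
for _ in range(3000):
    x = rng.uniform(-1, 1, size=(1, d))
    y = np.sin(x.sum())
    phi = random_fourier_features(x, W, b)[0]
    theta += lr * (y - phi @ theta) * phi
```

Because each update touches only a length-$D$ weight vector, the per-sample cost is independent of how many samples have been seen, which is the scalability advantage over kernel dictionaries; POLRF additionally resamples and reweights the frequencies `W` themselves, which is where the kernel-learning behaviour described above comes from.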

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved.