conference paper

Truly No-Regret Learning in Constrained MDPs

Müller, Adrian
•
Alatur, Pragnya
•
Cevher, Volkan
February 24, 2024
Proceedings of the International Conference on Machine Learning, 21-27 July 2024, Vienna, Austria
41st International Conference on Machine Learning (ICML 2024)

Constrained Markov decision processes (CMDPs) are a common way to model safety constraints in reinforcement learning. State-of-the-art methods for efficiently solving CMDPs are based on primal-dual algorithms. For these algorithms, all currently known regret bounds allow for error cancellations: one can compensate for a constraint violation in one round with a strict constraint satisfaction in another. This makes the online learning process unsafe, since it only guarantees safety for the final (mixture) policy but not during learning. As Efroni et al. (2020) pointed out, it is an open question whether primal-dual algorithms can provably achieve sublinear regret if we do not allow error cancellations. In this paper, we give the first affirmative answer. We first generalize a result on last-iterate convergence of regularized primal-dual schemes to CMDPs with multiple constraints. Building upon this insight, we propose a model-based primal-dual algorithm to learn in an unknown CMDP. We prove that our algorithm achieves sublinear regret without error cancellations.
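
To make the notion of error cancellation concrete, the following minimal sketch compares two ways of accumulating constraint violations over episodes. It is only an illustration of the distinction drawn in the abstract, not the paper's algorithm or its exact regret definition; the function names and the per-episode numbers are hypothetical.

```python
from typing import Sequence


def violation_with_cancellations(violations: Sequence[float]) -> float:
    """Signed sum clipped at zero: episodes with slack offset violated ones."""
    return max(sum(violations), 0.0)


def violation_without_cancellations(violations: Sequence[float]) -> float:
    """Sum of positive parts: every violated episode counts in full."""
    return sum(max(v, 0.0) for v in violations)


# Hypothetical per-episode values (assumed for illustration only):
# positive = constraint violated, negative = constraint satisfied with slack.
episode_violations = [0.5, -0.5, 0.25, -0.25]

with_cancellation = violation_with_cancellations(episode_violations)
without_cancellation = violation_without_cancellations(episode_violations)
print(with_cancellation, without_cancellation)  # prints "0.0 0.75"
```

In this toy sequence the cancelling measure reports zero violation even though half of the episodes violate the constraint, whereas the non-cancelling measure still records them. That gap is why a bound on the former says nothing about safety during learning.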

Name: muller24b.pdf
Type: Main Document
Version: Published version
Access type: openaccess
License Condition: CC BY
Size: 1.59 MB
Format: Adobe PDF
Checksum (MD5): d4febb3559d7ea0ff0c316b5daed2866
