Repository logo

Infoscience

  • English
  • French
Log In
Logo EPFL, École polytechnique fédérale de Lausanne

Infoscience

  • English
  • French
Log In
  1. Home
  2. Academic and Research Output
  3. Preprints and Working Papers
  4. Optimism in the Face of Ambiguity Principle for Multi-Armed Bandits
 
preprint

Optimism in the Face of Ambiguity Principle for Multi-Armed Bandits

Li, Mengmeng  
•
Kuhn, Daniel  
•
Taskesen, Bahar  
September 30, 2024

Follow-The-Regularized-Leader (FTRL) algorithms often enjoy optimal regret for adversarial as well as stochastic bandit problems and allow for a streamlined analysis. However, FTRL algorithms require the solution of an optimization problem in every iteration and are thus com- putationally challenging. In contrast, Follow-The-Perturbed-Leader (FTPL) algorithms achieve computational efficiency by perturbing the estimates of the rewards of the arms, but their regret analysis is cumbersome. We propose a new FTPL algorithm that generates optimal policies for both adversarial and stochastic multi-armed bandits. Similar to FTRL, our algorithm admits a unified regret analysis, and similar to FTPL, it offers low computational costs. Unlike existing FTPL algorithms that rely on independent additive disturbances governed by a known distribution, we allow for disturbances governed by an ambiguous distribution that is only known to belong to a given set and propose a principle of optimism in the face of ambiguity. Consequently, our framework generalizes existing FTPL algorithms. It also encapsulates a broad range of FTRL methods as special cases, including several optimal ones, which appears to be impossible with current FTPL methods. Finally, we use techniques from discrete choice theory to devise an efficient bisection algorithm for computing the optimistic arm-sampling probabilities. This algorithm is up to 10^4 times faster than standard FTRL algorithms that solve an optimization problem in every iteration. Our results not only settle existing conjectures but also provide new insights into the impact of perturbations by mapping FTRL to FTPL.

  • Details
  • Metrics
Type
preprint
ArXiv ID

2409.20440

Author(s)
Li, Mengmeng  

EPFL

Kuhn, Daniel  

EPFL

Taskesen, Bahar  
Date Issued

2024-09-30

Subjects

Computer Science - Learning

•

Statistics - Machine Learning

Subjects arXiv

cs.LG

•

stat.ML

Editorial or Peer reviewed

NON-REVIEWED

Written at

EPFL

EPFL units
RAO  
Available on Infoscience
February 14, 2025
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/246911
Logo EPFL, École polytechnique fédérale de Lausanne
  • Contact
  • infoscience@epfl.ch

  • Follow us on Facebook
  • Follow us on Instagram
  • Follow us on LinkedIn
  • Follow us on X
  • Follow us on Youtube
AccessibilityLegal noticePrivacy policyCookie settingsEnd User AgreementGet helpFeedback

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, tous droits réservés