Overlapping Multi-Bandit Best Arm Identification

In the multi-armed bandit literature, the multi-bandit best-arm identification problem consists of determining each best arm in a number of disjoint groups of arms, with as few total arm pulls as possible. In this paper, we introduce a variant of the multi-bandit problem with overlapping groups, and present two algorithms for this problem based on successive elimination and lower/upper confidence bounds (LUCB). We bound the number of total arm pulls required for high-probability best-arm identification in every group, and we complement these bounds with a near-matching algorithm-independent lower bound. In addition, we show that a specific choice of the groups recovers the top-k ranking problem.

Presented at:
The 2019 IEEE International Symposium on Information Theory (ISIT), Paris, France, July 7-12, 2019

 Record created 2019-04-01, last modified 2019-08-12

MultiBandit_FULL - Download fulltextPDF
MultiBandit_ISIT2019 - Download fulltextPDF
Rate this document:

Rate this document:
(Not yet reviewed)