AN EXACT BANDIT MODEL FOR THE RISK-VOLATILITY TRADEOFF

Authors: Hongler, Max-Olivier; Rivier, Renaud
Dates: 2024-05-16; 2024-04-24
DOI: 10.3934/jdg.2024011
Handle: https://infoscience.epfl.ch/handle/20.500.14299/208001
Web of Science ID: WOS:001216199000001
Type: journal article (research article)
Subject: Physical Sciences
Keywords: Sequential Stochastic Optimization; Continuous Time Multi-Armed Bandits; Diffusion Processes; Non-Gaussian Evolutions; Mean Preserving Spread

Abstract: We revisit the two-armed bandit (TAB) problem in which both arms are driven by diffusive stochastic processes with a common instantaneous reward. We focus on situations where the Radon-Nikodym derivative of the first arm's transition probability density with respect to that of the second arm is explicitly known, and we calculate how the corresponding Gittins indices behave under such a change of probability measure. This general framework is then used to solve the optimal allocation of a TAB problem in which the first arm is driven by a pure Brownian motion and the second by a centered, super-diffusive, non-Gaussian process whose variance grows quadratically in time. The probability spread due to the super-diffusion introduces an extra risk into the allocation problem, and this drastically affects the optimal decision rule. Our model illustrates the interplay between the notions of risk and volatility.
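To make the contrast between the two arms concrete, the following is a minimal sketch under stated assumptions, not the authors' construction. The abstract does not name the super-diffusive process, so the code assumes one simple candidate: a standard Brownian motion plus a ballistic term with a random sign, X_t = S·β·t + W_t with S = ±1 equiprobable. This process is centered, non-Gaussian, and has variance t + β²t², while its density relative to the Brownian one is the explicit factor cosh(βx)·exp(−β²t/2). The function names and the parameter `beta` are hypothetical.

```python
import numpy as np


def simulate_endpoints(beta, horizon, n_paths, seed=0):
    """Sample both arms at time `horizon`, starting from 0.

    Arm 1: pure Brownian motion, W_t ~ N(0, t).
    Arm 2 (ASSUMED form, not taken from the source): S*beta*t + W_t with
    S = +1 or -1 with probability 1/2 each, i.e. a centered mixture of
    two drifted Brownian motions.
    """
    rng = np.random.default_rng(seed)
    brownian = rng.normal(0.0, np.sqrt(horizon), size=n_paths)
    signs = rng.choice([-1.0, 1.0], size=n_paths)
    noise = rng.normal(0.0, np.sqrt(horizon), size=n_paths)
    spread = signs * beta * horizon + noise
    return brownian, spread


def theoretical_variances(beta, horizon):
    """Variances at time t: t for arm 1, t + (beta*t)^2 for the assumed arm 2."""
    return horizon, horizon + (beta * horizon) ** 2


def density_ratio(x, beta, horizon):
    """Density of the assumed arm 2 divided by the Brownian density at (x, t)."""
    return np.cosh(beta * x) * np.exp(-0.5 * beta**2 * horizon)


if __name__ == "__main__":
    w, x = simulate_endpoints(beta=1.0, horizon=2.0, n_paths=200_000)
    print("sample variances:", w.var(), x.var())
    print("theoretical     :", theoretical_variances(1.0, 2.0))
```

Both arms have zero mean, so the second differs from the first only through its extra spread, which is the sense in which the abstract speaks of an added risk at equal expected reward.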