Novelty as a drive of human exploration in complex stochastic environments
In order to find extrinsic rewards, humans explore their environment even if exploration requires several intermediate, reward-free decisions. It has been hypothesized that intrinsic rewards, such as novelty, surprise, or information gain, guide this reward-free exploration. However, in artificial agents, different intrinsic reward signals induce exploration strategies that respond differently to stochasticity. In particular, some strategies are vulnerable to the “noisy TV problem,” i.e., an attraction to irrelevant stochastic stimuli. Here, we ask whether humans exhibit a similar attraction to reward-free stochasticity. We design a multistep decision-making paradigm in which participants search for rewarding states in a complex environment containing a highly stochastic but reward-free subregion. We show that i) participants persistently explore the stochastic subregion, and ii) their decisions are best explained by a novelty-driven exploration strategy, compared to alternatives driven by information gain or surprise. Our findings suggest that novelty and extrinsic rewards jointly control human exploration in complex environments.
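To make the contrast between intrinsic reward signals concrete, the following toy sketch compares three of them on a deterministic transition and on a "noisy TV" transition. It is not the paradigm or the model from the paper. The environment (one deterministic successor, `K` equally likely noisy successors), the alternating policy, the count-based definitions of `surprise`, `novelty`, and `info_gain`, and all function names are assumptions chosen for illustration.

```python
import math
import random

K = 50            # assumed number of equally likely "noisy TV" successor states
N_STATES = K + 1  # state 0 is the single deterministic successor


def predictive(counts, prior=1.0):
    """Predictive distribution of a Dirichlet-multinomial transition model."""
    total = sum(counts) + prior * len(counts)
    return [(c + prior) / total for c in counts]


def signals(trans_counts, visit_counts, nxt):
    """Compute three illustrative intrinsic signals for observing state `nxt`,
    then update the counts in place."""
    before = predictive(trans_counts)
    # Surprise: negative log-probability of the outcome under the current model.
    surprise = -math.log(before[nxt])
    # Novelty: negative log of the smoothed visit frequency of the reached state.
    novelty = -math.log(
        (visit_counts[nxt] + 1) / (sum(visit_counts) + len(visit_counts))
    )
    trans_counts[nxt] += 1
    visit_counts[nxt] += 1
    after = predictive(trans_counts)
    # Information gain: KL divergence from the old to the updated prediction.
    info_gain = sum(q * math.log(q / p) for p, q in zip(before, after))
    return {"surprise": surprise, "novelty": novelty, "info_gain": info_gain}


def run(rounds=1000, seed=0):
    """Alternate between a deterministic action and a noisy action."""
    rng = random.Random(seed)
    visits = [0] * N_STATES
    counts = {"det": [0] * N_STATES, "noisy": [0] * N_STATES}
    last = {}
    for _ in range(rounds):
        last["det"] = signals(counts["det"], visits, 0)
        last["noisy"] = signals(counts["noisy"], visits, rng.randint(1, K))
    return last


result = run()
for name, values in result.items():
    print(name, {k: round(v, 4) for k, v in values.items()})
```

Under these assumptions, surprise and novelty remain much larger for the noisy transition than for the deterministic one after many repetitions, while this form of information gain shrinks toward zero for both. Which signal best describes human behavior is an empirical question that the study addresses with its own models.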
10.1073_pnas.2502193122.pdf
Main Document
Published version
openaccess
CC BY