Infoscience

research article

Brain signals of a Surprise-Actor-Critic model: Evidence for multiple learning modules in human decision making

Liakoni, Vasiliki • Lehmann, Marco P. • Modirshanechi, Alireza • Brea, Johanni • Lutti, Antoine • Gerstner, Wulfram • Preuschoff, Kerstin
February 1, 2022
Neuroimage

Learning how to reach a reward over long series of actions is a remarkable human capability, potentially guided by multiple parallel learning modules. Current brain imaging of learning modules is limited by (i) simple experimental paradigms, (ii) entanglement of brain signals of different learning modules, and (iii) a limited number of computational models considered as candidates for explaining behavior. Here, we address these three limitations: we (i) introduce a complex sequential decision-making task with surprising events that allows us to (ii) dissociate correlates of reward prediction errors from those of surprise in functional magnetic resonance imaging (fMRI), and (iii) test behavior against a large repertoire of model-free, model-based, and hybrid reinforcement learning algorithms, including a novel surprise-modulated actor-critic algorithm. In our algorithm, surprise is derived from an approximate Bayesian approach for learning the world-model and is extracted from a state prediction error. Surprise is then used to modulate the learning rate of a model-free actor, which itself learns via the reward prediction error from model-free value estimation by the critic. We find that action choices are well explained by pure model-free policy gradient, but reaction times and neural data are not. We identify signatures of both model-free and surprise-based learning signals in blood oxygen level dependent (BOLD) responses, supporting the existence of multiple parallel learning modules in the brain. Our results extend previous fMRI findings to a multi-step setting and emphasize the role of policy gradient and surprise signalling in human learning.
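
The abstract describes the surprise-modulated actor-critic only in words, so the following is a minimal tabular sketch of that idea, not the authors' fitted model. Everything specific here is an assumption made for illustration: the class name `SurpriseActorCritic`, the count-based world-model (a simple stand-in for the approximate Bayesian approach), the definition of surprise as one minus the predicted probability of the observed next state, and the linear rule that scales the actor's learning rate by `1 + surprise_gain * surprise`.

```python
import math
import random


class SurpriseActorCritic:
    """Tabular actor-critic whose actor learning rate is scaled by surprise.

    Illustrative only: the modulation rule and all parameter names are
    assumptions, not the model described in the paper.
    """

    def __init__(self, n_states, n_actions, gamma=0.9, critic_lr=0.1,
                 actor_lr=0.1, surprise_gain=1.0, seed=0):
        self.n_states = n_states
        self.n_actions = n_actions
        self.gamma = gamma
        self.critic_lr = critic_lr
        self.actor_lr = actor_lr
        self.surprise_gain = surprise_gain
        self.rng = random.Random(seed)
        self.values = [0.0] * n_states                      # critic: V(s)
        self.prefs = [[0.0] * n_actions for _ in range(n_states)]  # actor
        # World-model: transition pseudo-counts, starting at 1 (uniform prior).
        self.counts = [[[1.0] * n_states for _ in range(n_actions)]
                       for _ in range(n_states)]

    def policy(self, state):
        """Softmax over action preferences in the given state."""
        prefs = self.prefs[state]
        top = max(prefs)
        exps = [math.exp(p - top) for p in prefs]
        total = sum(exps)
        return [e / total for e in exps]

    def act(self, state):
        """Sample an action from the softmax policy."""
        weights = self.policy(state)
        return self.rng.choices(range(self.n_actions), weights=weights)[0]

    def update(self, state, action, reward, next_state, terminal=False):
        """Apply one learning step and return (surprise, reward prediction error)."""
        # State prediction error: how unexpected the observed transition was.
        row = self.counts[state][action]
        predicted = row[next_state] / sum(row)
        surprise = 1.0 - predicted
        row[next_state] += 1.0

        # Critic: model-free temporal-difference reward prediction error.
        bootstrap = 0.0 if terminal else self.gamma * self.values[next_state]
        rpe = reward + bootstrap - self.values[state]
        self.values[state] += self.critic_lr * rpe

        # Actor: policy-gradient step with a surprise-scaled learning rate.
        lr = self.actor_lr * (1.0 + self.surprise_gain * surprise)
        probs = self.policy(state)
        for a in range(self.n_actions):
            indicator = 1.0 if a == action else 0.0
            self.prefs[state][a] += lr * rpe * (indicator - probs[a])
        return surprise, rpe
```

In this sketch, repeating the same transition makes it less surprising, so the actor's effective learning rate decays toward `actor_lr`, while an unexpected transition temporarily raises it. The critic and the reward prediction error stay purely model-free; only the step size of the actor depends on the world-model.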

Type: research article
DOI: 10.1016/j.neuroimage.2021.118780
Web of Science ID: WOS:000736989900003
Author(s):
  • Liakoni, Vasiliki
  • Lehmann, Marco P.
  • Modirshanechi, Alireza
  • Brea, Johanni
  • Lutti, Antoine
  • Gerstner, Wulfram
  • Preuschoff, Kerstin
Date Issued: 2022-02-01
Publisher: ACADEMIC PRESS INC ELSEVIER SCIENCE
Published in: Neuroimage
Volume: 246
Article Number: 118780

Subjects:
  • Neurosciences
  • Neuroimaging
  • Radiology, Nuclear Medicine & Medical Imaging
  • Neurosciences & Neurology
  • reinforcement learning
  • surprise
  • human learning
  • sequential decision making
  • behavior
  • fMRI
  • carlo sampling methods
  • reward prediction
  • Bayesian surprise
  • neural mechanisms
  • basal ganglia
  • reinforcement
  • selection
  • striatum
  • neurons

Editorial or Peer reviewed: REVIEWED
Written at: EPFL
EPFL units: LCN
Available on Infoscience: January 31, 2022
Use this identifier to reference this record: https://infoscience.epfl.ch/handle/20.500.14299/184987

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved