Robust Reinforcement Learning via Adversarial training with Langevin Dynamics

Parameswaran, Kamalaruban; Huang, Yu-Ting; Hsieh, Ya-Ping; Rolland, Paul Thierry Yves; Shi, Cheng; Cevher, Volkan

2020

Abstract

We introduce a sampling perspective to tackle the challenging task of training robust Reinforcement Learning (RL) agents. Leveraging the powerful Stochastic Gradient Langevin Dynamics, we present a novel, scalable two-player RL algorithm, which is a sampling variant of the two-player policy gradient method. Our algorithm consistently outperforms existing baselines, in terms of generalization across different training and testing conditions, on several MuJoCo environments. Our experiments also show that, even for objective functions that entirely ignore potential environmental shifts, our sampling approach remains highly robust in comparison to standard RL algorithms.
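As a rough illustration of the sampling idea, the sketch below runs a Langevin-style noisy gradient descent-ascent on a toy two-player objective. It is not the authors' algorithm: the objective `f`, the function name `langevin_gda`, and the parameters `eta` (step size) and `beta` (inverse temperature) are assumptions chosen for illustration, and the gradients are exact, whereas Stochastic Gradient Langevin Dynamics would use noisy minibatch estimates.

```python
import math
import random


def langevin_gda(steps=2000, eta=0.05, beta=1e4, seed=0):
    """Noisy descent-ascent on the toy game f(x, y) = 0.5*x**2 + x*y - 0.5*y**2.

    Player x minimizes f, player y maximizes f; the saddle point is (0, 0).
    All names and default values are hypothetical, not taken from the paper.
    """
    rng = random.Random(seed)
    x, y = 1.0, -1.0
    # Langevin noise scale: sqrt(2 * step size / inverse temperature).
    noise_scale = math.sqrt(2.0 * eta / beta)
    for _ in range(steps):
        grad_x = x + y  # df/dx
        grad_y = x - y  # df/dy
        # Simultaneous update: descent step for x, ascent step for y,
        # each perturbed by independent Gaussian noise.
        x_new = x - eta * grad_x + noise_scale * rng.gauss(0.0, 1.0)
        y_new = y + eta * grad_y + noise_scale * rng.gauss(0.0, 1.0)
        x, y = x_new, y_new
    return x, y


x, y = langevin_gda()
```

With a large `beta` the injected noise is small and the iterates settle near the saddle point; lowering `beta` makes the players sample more broadly around it, which is the exploration effect a sampling approach relies on.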

Details

Title Robust Reinforcement Learning via Adversarial training with Langevin Dynamics

Author(s) Parameswaran, Kamalaruban ; Huang, Yu-Ting ; Hsieh, Ya-Ping ; Rolland, Paul Thierry Yves ; Shi, Cheng ; Cevher, Volkan

Pagination 46

Date 2020-11-05

Publisher Vancouver, Canada, 34th Conference on Neural Information Processing Systems (NeurIPS 2020)

Keywords

ml-ai; Deep Reinforcement Learning; Robustness; Robust MDP; Markov Games

Note ml-ai

Additional link arXiv

Laboratories LIONS

Record appears in Scientific production and competences > STI - School of Engineering > IEM - Institut d'Electricité et de Microtechnique > LIONS - Laboratory for Information and Inference Systems; Peer-reviewed publications; Work produced at EPFL; Technical Reports; Accepted

Record creation date 2020-02-17
