Autonomous reinforcement learning with experience replay

Wawrzynski, Pawel; Tanwani, Ajay Kumar

doi:10.1016/j.neunet.2012.11.007

2013

Abstract

This paper considers the issues of efficiency and autonomy that are required to make reinforcement learning suitable for real-life control tasks. A real-time reinforcement learning algorithm is presented that repeatedly adjusts the control policy with the use of previously collected samples, and autonomously estimates the appropriate step-sizes for the learning updates. The algorithm is based on the actor-critic with experience replay whose step-sizes are determined on-line by an enhanced fixed point algorithm for on-line neural network training. An experimental study with a simulated octopus arm and a half-cheetah demonstrates the feasibility of the proposed algorithm to solve difficult learning control problems in an autonomous way within a reasonably short time. © 2012 Elsevier Ltd. All rights reserved.
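
The idea of repeatedly adjusting estimates "with the use of previously collected samples" can be illustrated with a minimal sketch. This is not the authors' algorithm: it shows only a replay buffer feeding TD(0) updates of a tabular critic on a hypothetical 4-state chain. The actor, the neural networks, and the automatic step-size estimation described in the abstract are all omitted, and every name and constant below (`ReplayBuffer`, `td0_update`, `run`, `GAMMA`, `STEP_SIZE`) is an assumption made for illustration. The policy is fixed (always move one state to the right), so the replayed samples need no off-policy correction here.

```python
import random
from collections import deque

GAMMA = 0.9      # discount factor (assumed for this toy problem)
STEP_SIZE = 0.1  # fixed by hand here; the paper estimates step-sizes autonomously
TERMINAL = 3     # last state of a hypothetical 4-state chain 0 -> 1 -> 2 -> 3


class ReplayBuffer:
    """Fixed-capacity store of (state, reward, next_state) transitions."""

    def __init__(self, capacity, seed=0):
        self.data = deque(maxlen=capacity)  # oldest transitions are evicted first
        self.rng = random.Random(seed)

    def add(self, state, reward, next_state):
        self.data.append((state, reward, next_state))

    def sample(self, k):
        # Uniform sampling with replacement from the stored transitions.
        return [self.rng.choice(self.data) for _ in range(k)]


def td0_update(values, transition):
    """Apply one TD(0) update of the tabular value estimates in place."""
    state, reward, next_state = transition
    bootstrap = 0.0 if next_state == TERMINAL else values[next_state]
    td_error = reward + GAMMA * bootstrap - values[state]
    values[state] += STEP_SIZE * td_error


def run(episodes=50, replays_per_step=10):
    """Collect transitions on the chain and learn from replayed samples."""
    values = [0.0] * (TERMINAL + 1)
    buffer = ReplayBuffer(capacity=100)
    for _ in range(episodes):
        state = 0
        while state != TERMINAL:
            next_state = state + 1  # fixed policy: always step right
            reward = 1.0 if next_state == TERMINAL else 0.0
            buffer.add(state, reward, next_state)
            # Each real step is followed by several updates on stored samples.
            for transition in buffer.sample(replays_per_step):
                td0_update(values, transition)
            state = next_state
    return values
```

Calling `run()` returns value estimates for states 0 to 2 that should approach the discounted returns 0.81, 0.9, and 1.0 of this toy chain. Each real transition is reused many times, which is the sample-efficiency motivation for experience replay.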

Details

Title Autonomous reinforcement learning with experience replay

Author(s) Wawrzynski, Pawel ; Tanwani, Ajay Kumar

Published in Neural Networks

Pagination 12

Volume 41

Pages 156-167

Date 2013

Publisher Oxford, Pergamon-Elsevier Science Ltd

ISSN 0893-6080

Keywords

Actor-critic; Reinforcement learning; Autonomous learning; Step-size estimation

Note Special Issue on Autonomous Learning

DOI https://doi.org/10.1016/j.neunet.2012.11.007

Laboratories LASA
LIDIAP

Record appears in Euler Center for Signal Processing
Peer-reviewed publications
Work produced at EPFL
Journal Articles
Published

Record creation date 2013-10-01