Autonomous reinforcement learning with experience replay

Wawrzynski, Pawel; Tanwani, Ajay Kumar

doi:10.1016/j.neunet.2012.11.007

research article

2013

Neural Networks

This paper considers the issues of efficiency and autonomy that must be addressed to make reinforcement learning suitable for real-life control tasks. A real-time reinforcement learning algorithm is presented that repeatedly adjusts the control policy using previously collected samples and autonomously estimates the appropriate step-sizes for the learning updates. The algorithm is based on the actor-critic with experience replay, whose step-sizes are determined on-line by an enhanced fixed point algorithm for on-line neural network training. An experimental study with a simulated octopus arm and half-cheetah demonstrates that the proposed algorithm can solve difficult learning control problems autonomously within a reasonably short time. © 2012 Elsevier Ltd. All rights reserved.

Type

research article

DOI

10.1016/j.neunet.2012.11.007

Web of Science ID

WOS:000318209900015

Authors

Wawrzynski, Pawel; Tanwani, Ajay Kumar

Publication date

2013

Publisher

Pergamon-Elsevier Science Ltd

Published in

Neural Networks

Volume

41

Start page

156

End page

167

Subjects

Actor–critic

Reinforcement learnin...

Autonomous learning

Step-size estimation

Note

Special Issue on Autonomous Learning

Peer reviewed

REVIEWED

EPFL units

LASA

LIDIAP

Available on Infoscience

October 1, 2013

Use this identifier to reference this record

https://infoscience.epfl.ch/handle/20.500.14299/95449