Abstract

This work presents a fully distributed algorithm for learning the optimal policy in a multi-agent cooperative reinforcement learning setting. We focus on games that can only be solved through coordinated teamwork. We consider situations in which K agents interact simultaneously with an environment and with one another to attain a common goal. In the algorithm, agents communicate only with other agents in their immediate neighborhood and choose their actions independently of one another based solely on local information. Learning is done off-policy, which results in high data efficiency. The proposed algorithm is a stochastic primal-dual method and can be shown to converge even when used in conjunction with a wide class of function approximators.
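
To make the structure of such a decentralized stochastic primal-dual update concrete, the following is a minimal illustrative sketch, not the paper's exact method: each agent keeps its own primal parameters (linear value weights) and dual variables, performs GTD-style saddle-point updates on locally observed off-policy transitions, and then averages its primal parameters with immediate neighbors over a fixed communication graph. The feature map, step sizes, reward, and ring topology are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
K, d, gamma = 4, 8, 0.95            # number of agents, feature dimension, discount
alpha, beta = 0.05, 0.05            # primal / dual step sizes (illustrative)

# Doubly stochastic mixing matrix for a ring graph: each agent averages
# only with its two immediate neighbors (local communication).
W = np.zeros((K, K))
for k in range(K):
    W[k, k] = 0.5
    W[k, (k - 1) % K] = 0.25
    W[k, (k + 1) % K] = 0.25

theta = rng.normal(size=(K, d))     # primal variables, one local copy per agent
w = np.zeros((K, d))                # dual variables, one local copy per agent

def features(s):
    """Illustrative feature map for a scalar state (assumed for the sketch)."""
    proj = np.linspace(1.0, d, d)
    return np.cos(proj * s) / np.sqrt(d)

for t in range(2000):
    for k in range(K):
        # Each agent draws a transition from its own behavior-policy experience
        # (off-policy data); here the dynamics are a toy random walk.
        s = rng.uniform(-1.0, 1.0)
        s_next = np.clip(s + rng.normal(scale=0.1), -1.0, 1.0)
        r = -s**2                                    # shared team reward signal
        phi, phi_next = features(s), features(s_next)

        # Temporal-difference error under the agent's current primal estimate.
        delta = r + gamma * (phi_next @ theta[k]) - phi @ theta[k]
        # Stochastic dual ascent / primal descent on the saddle-point objective
        # (a GTD2-style update, used here purely as an illustration).
        w[k] += beta * (delta - phi @ w[k]) * phi
        theta[k] += alpha * (phi - gamma * phi_next) * (phi @ w[k])

    # Local consensus step: mix primal parameters with immediate neighbors only.
    theta = W @ theta

print("disagreement across agents:", np.linalg.norm(theta - theta.mean(axis=0)))
```

The consensus step `theta = W @ theta` is what keeps the computation fully distributed: no agent ever needs global information, yet the local copies are driven toward agreement while each agent's primal-dual updates use only its own off-policy samples.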

Details