Exploiting environmental signals to enable policy correlation in large-scale decentralized systems

Danassis, Panayiotis; Erden, Zeki Doruk; Faltings, Boi

doi:10.1007/s10458-021-09541-7

2022

Abstract

Can artificial agents benefit from human conventions? Human societies manage to self-organize and resolve the tragedy of the commons in common-pool resources, in spite of the bleak prediction of non-cooperative game theory. Moreover, real-world problems are inherently large-scale and of low observability. One key concept that facilitates human coordination in such settings is the use of conventions. Inspired by human behavior, we investigate the learning dynamics and emergence of temporal conventions, focusing on common-pool resources. We place particular emphasis on designing a realistic evaluation setting: (a) environment dynamics are modeled on real-world fisheries, (b) we assume decentralized learning, where agents can observe only their own history, and (c) we run large-scale simulations (up to 64 agents). Uncoupled policies and low observability make cooperation hard to achieve; as the number of agents grows, the probability of taking a correct gradient direction decreases exponentially. By introducing an arbitrary common signal (e.g., date, time, or any periodic set of numbers) as a means to couple the learning process, we show that temporal conventions can emerge and agents reach sustainable harvesting strategies. The introduction of the signal consistently improves the social welfare (by 258% on average, up to 3306%), the range of environmental parameters where sustainability can be achieved (by 46% on average, up to 300%), and the convergence speed in low abundance settings (by 13% on average, up to 53%).
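
As a rough illustration of what a temporal convention keyed on a common signal might look like, the sketch below compares two hand-written harvesting policies in a toy common-pool resource. It is not the method of the paper: the agents there learn their policies with deep reinforcement learning, whereas here the policies are fixed by hand. The logistic regrowth model, the parameter values, and the names `simulate`, `greedy`, and `take_turns` are all assumptions made for this example.

```python
def simulate(n_agents, steps, policy, stock=100.0,
             capacity=100.0, growth=0.5, effort=0.2):
    """Run a toy common-pool resource; return (total_harvest, final_stock).

    All dynamics and parameters are illustrative assumptions,
    not the fishery model used in the paper.
    """
    total = 0.0
    for t in range(steps):
        # Arbitrary periodic signal that every agent can observe.
        signal = t % n_agents
        for agent in range(n_agents):
            if policy(agent, signal):
                catch = effort * stock  # harvest a fixed fraction of the stock
                stock -= catch
                total += catch
        # Logistic regrowth of whatever stock is left (toy assumption).
        stock += growth * stock * (1.0 - stock / capacity)
    return total, stock


def greedy(agent, signal):
    """Uncoupled behavior: every agent harvests at every step."""
    return True


def take_turns(agent, signal):
    """Temporal convention: harvest only when the signal matches your index."""
    return agent == signal


greedy_total, greedy_stock = simulate(8, 200, greedy)
turns_total, turns_stock = simulate(8, 200, take_turns)

print(f"greedy:     harvest={greedy_total:.1f}, final stock={greedy_stock:.3g}")
print(f"take turns: harvest={turns_total:.1f}, final stock={turns_stock:.3g}")
```

Under these assumed dynamics, the greedy population depletes the stock within a few steps, while the turn-taking convention keeps the stock at a sustainable level and collects a much larger cumulative harvest. The signal itself carries no information about the resource; it only gives the agents a shared index on which their behavior can be correlated.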

Details

Title Exploiting environmental signals to enable policy correlation in large-scale decentralized systems

Author(s) Danassis, Panayiotis ; Erden, Zeki Doruk ; Faltings, Boi

Published in Autonomous Agents And Multi-Agent Systems

Volume 36

Issue 1

Pages 13

Date 2022-04-01

Publisher Dordrecht, SPRINGER

ISSN 1387-2532
1573-7454

Keywords

multi-agent deep reinforcement learning; coordination; resource allocation; sustainability; social conventions; social dilemmas; tragedy

DOI https://doi.org/10.1007/s10458-021-09541-7

Laboratories LIA

Record Appears in Scientific production and competences > I&C - School of Computer and Communication Sciences > IINFCOM > LIA - Artificial Intelligence Laboratory
Peer-reviewed publications
Work produced at EPFL
Journal Articles
Published

Record creation date 2022-02-14
