Repository logo

Infoscience

  • English
  • French
Log In
Logo EPFL, École polytechnique fédérale de Lausanne

Infoscience

  • English
  • French
Log In
  1. Home
  2. Academic and Research Output
  3. Conferences, Workshops, Symposiums, and Seminars
  4. A Natural Actor-Critic Framework for Zero-Sum Markov Games
 
conference paper

A Natural Actor-Critic Framework for Zero-Sum Markov Games

Alacaoglu, Ahmet
•
Viano, Luca  
•
He, Niao
Show more
2022
International Conference on Machine Learning, 17-23 July 2022, Baltimore, Maryland, USA
39th International Conference on Machine Learning (ICML)

We introduce algorithms based on natural actorcritic and analyze their sample complexity for solving two player zero-sum Markov games in the tabular case. Our results improve the best-known sample complexities of policy gradient/actorcritic methods for convergence to Nash equilibrium in the multi-agent setting. We use the error propagation scheme in approximate dynamic programming, recent advances for global convergence of policy gradient methods, temporal difference learning, and techniques from stochastic primal-dual optimization. Our algorithms feature two stages, requiring agents to agree on an etiquette before starting their interactions, which is feasible for instance in self-play. However, the agents only access to joint reward and joint next state and not to each other’s actions or policies. Our complexity results match the best-known results for global convergence of policy gradient algorithms for single agent RL. We provide numerical verification of our methods for a two player bandit environment and a two player game, Alesia. We observe improved empirical performance as compared to the recently proposed optimistic gradient descent-ascent variant for Markov games.

  • Files
  • Details
  • Metrics
Loading...
Thumbnail Image
Name

alacaoglu22a.pdf

Type

Postprint

Version

http://purl.org/coar/version/c_ab4af688f83e57aa

Access type

openaccess

License Condition

copyright

Size

809.41 KB

Format

Adobe PDF

Checksum (MD5)

44d6a1d4a74bf33e9ee0540125cca1f9

Logo EPFL, École polytechnique fédérale de Lausanne
  • Contact
  • infoscience@epfl.ch

  • Follow us on Facebook
  • Follow us on Instagram
  • Follow us on LinkedIn
  • Follow us on X
  • Follow us on Youtube
AccessibilityLegal noticePrivacy policyCookie settingsEnd User AgreementGet helpFeedback

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, tous droits réservés