Game-Theoretic Decision Making in Multi-Agent Systems under Constraints and Welfare Objectives
Interactions of multiple agents lie at the heart of many real-world systems, including transportation networks, auctions, electricity markets, and multi-robot systems. Game theory provides a natural framework for modeling such strategic interactions. Historically, research in this field has focused on the existence and computation of equilibria, as well as on learning dynamics that allow agents to converge to stable outcomes. However, when applying such game-theoretic tools to real-world systems, several challenges emerge: In many settings, agents face constraints that render certain decisions infeasible. This is relevant in domains where agents are limited by shared resources, safety and fairness requirements, or physical constraints. A second challenge concerns system-level efficiency, as strategic behavior frequently leads to socially inefficient outcomes, such as congestion or resource overuse. In this thesis, we aim to broaden the applicability of the game-theoretic framework by addressing these challenges through novel equilibrium concepts and decentralized learning dynamics.
The first part of the thesis studies multi-agent systems subject to constraints. In Chapter 3, we consider coupling constraints, where an agent's feasibility depends on the joint strategy profile. We formalize the notion of a constrained correlated equilibrium and establish sufficient conditions for its existence in Markov games. In Chapter 4, we address the problem of learning in contextual games with private constraints, where feasibility is determined by an agent's own decision set. We propose a learning algorithm and derive sublinear convergence bounds for both cumulative regret and constraint violations. The effectiveness of this approach is demonstrated through a multi-building temperature control case study.
The second part of the thesis focuses on efficiency in multi-agent systems with welfare objectives. In Chapter 5, we study how agents can be steered toward socially desirable outcomes within a Stackelberg framework without prior knowledge of the agents' reward functions. We develop a no-regret learning algorithm for the leader and prove convergence to a Stackelberg equilibrium when followers play approximate Nash equilibria. This approach is evaluated in an electric ride-hailing pricing study. Finally, in Chapter 6, we consider the problem of equilibrium selection in potential games. While log-linear learning is known to converge asymptotically to efficient Nash equilibria, we provide the first finite-time convergence guarantees for these dynamics and establish robustness under reduced feedback and utility perturbations.
EPFL_TH11663.pdf
Main Document
Published version
openaccess
N/A
8.01 MB
Adobe PDF
c1816a04683d917acc2b8de7c2c1600a