Multi-Agent Learning for Resource Allocation Problems

Faltings, BoiCigler, Luděk2013-01-142013-01-142013-01-14201310.5075/epfl-thesis-5599https://infoscience.epfl.ch/handle/20.500.14299/87762urn:nbn:ch:bel-epfl-thesis5599-3We analyze resource allocation problems where N independent agents want to access C resources. Each resource can be only accessed by one agent at a time. In order to use the resources efficiently, the agents need to coordinate their access. We focus on decentralized coordination, i.e. coordination without central authority. We analyze coordination mechanisms for two different kind of agents: 1) cooperative, who follow the prescribed protocol entirely; and 2) non-cooperative, who attempt to maximize their own utility. We propose a novel approach to achieve a fair and efficient resource allocation when the agents are cooperative. The agents access resources in slots. At the beginning of each slot, they observe a global coordination signal – a random integer 1 ≤ k ≤ K . The agents then learn a different allocation for each value of the coordination signal. When modeled as a game, the canonical solution to the resource allocation problem is the correlated equilibrium. In a correlated equilibrium, a “smart” coordination device chooses the actions for the “stupid” agents, who then have no incentive to deviate from these recommendations. In contrast, in our solution the coordination signal is “stupid”, since it is not specific to the game. The agents are “smart”, because they learn their strategy independently for each signal value. We show that our learning algorithm converges to an efficient resource allocation in polynomial time. The resulting allocation becomes more fair as the number of signals K increases. A non-cooperative, self-interested agent can exploit our cooperative allocation scheme by accessing one resource all the time, until everyone else gives up. Therefore, for the non-cooperative agents, we consider an infinitely repeated resource allocation game with discounting. This game is symmetric and all the agents are identical, so we look for its symmetric subgame-perfect equilibria. (Bhaskar (2000)) proposed a solution for 2-agent, 1-resource allocation games: Agents start by symmetrically choosing their actions randomly, and as soon as they each choose different actions, they start to follow a convention that prescribes their actions from then on. We extend the concept of convention to the general resource allocation problems of N agents and C resources. We show that for any convention, there exists a symmetric subgame-perfect equilibrium that implements it. We present two conventions: bourgeois, where agents stick to the first allocation; and market, where agents pay for the use of resources, and observe a global coordination signal that allows them to alternate between different allocations. We define the price of anonymity of a convention as the ratio between the maximum social payoff of any (asymmetric) strategy profile and the expected social payoff of the convention. We show that the price of anonymity of the bourgeois convention is infinite. The market convention decreases this price by reducing the conflict between the agents.enResource allocationMulti-agent LearningGame TheorySymmetric EquilibriaCorrelated EquilibriaEfficiencyFairnessMulti-Agent Learning for Resource Allocation Problemsthesis::doctoral thesis