Mean field for Markov Decision Processes: from Discrete to Continuous Optimization

Gast, Nicolas Gabriel; Gaujal, Bruno; Le Boudec, Jean-Yves

2010

Abstract

We study the convergence of Markov Decision Processes made of a large number of objects to optimization problems on ordinary differential equations (ODEs). We show that the optimal reward of such a Markov Decision Process, which satisfies a Bellman equation, converges to the solution of a continuous Hamilton-Jacobi-Bellman (HJB) equation based on the mean field approximation of the Markov Decision Process. We give bounds on the difference of the rewards, and a constructive algorithm for deriving an approximating solution to the Markov Decision Process from a solution of the HJB equation. Three examples, pertaining respectively to investment strategies, population dynamics control and scheduling in queues, are developed. They are used to illustrate and justify the construction of the controlled ODE and to show the gain obtained by solving a continuous HJB equation rather than a large discrete Bellman equation.

Details

Title Mean field for Markov Decision Processes: from Discrete to Continuous Optimization

Author(s) Gast, Nicolas Gabriel ; Gaujal, Bruno ; Le Boudec, Jean-Yves

Pagination 24

Date 2010

Keywords

Mean field; Optimization; Markov Decision Process; Hamilton-Jacobi-Bellman

Laboratories LCA2

Record appears in LCA2 - Computer Communications and Applications Laboratory 2
Work produced at EPFL
Technical Reports
Published

Record creation date 2010-04-13
