196322
20190812205738.0
CONF
Robust Data-Driven Dynamic Programming
2013
2013
Conference Papers
In stochastic optimal control, the distribution of the exogenous noise is typically unknown and must be inferred from limited data before dynamic programming (DP)-based solution schemes can be applied. If the conditional expectations in the DP recursions are estimated via kernel regression, however, the historical sample paths enter the solution procedure directly, as they determine the evaluation points of the cost-to-go functions. The resulting data-driven DP scheme is asymptotically consistent and admits efficient computational solution when combined with parametric value function approximations. If training data is sparse, however, the estimated cost-to-go functions display high variability and an optimistic bias, while the corresponding control policies perform poorly in out-of-sample tests. To mitigate these small-sample effects, we propose a robust data-driven DP scheme, which replaces the expectations in the DP recursions with worst-case expectations over a set of distributions close to the best estimate. We show that the arising min-max problems in the DP recursions reduce to tractable conic programs. We also demonstrate that this robust algorithm dominates state-of-the-art benchmark algorithms in out-of-sample tests across several application domains.
Hanasusanto, Grani A.
247589
Kuhn, Daniel
239987
Neural Information Processing Systems
Lake Tahoe, USA
December 2013
Burges, C. J. C.
ed.
Bottou, L.
ed.
Welling, M.
ed.
Ghahramani, Z.
ed.
Weinberger, K. Q.
ed.
NIPS Proceedings 26
daniel.kuhn@epfl.ch
http://papers.nips.cc/paper/5123-robust-data-driven-dynamic-programming.pdf
http://papers.nips.cc/paper/5123-robust-data-driven-dynamic-programming
URL
252496
RAO
U12788
oai:infoscience.tind.io:196322
CDM
conf
GLOBAL_SET
112541
EPFL-CONF-196322
EPFL
REVIEWED
CONF