Estimates of Parameter Distributions for Optimal Action Selection

We present a general method for maintaining estimates of the distribution of parameters in arbitrary models. The method is then applied to estimating a probability distribution over actions in value-based reinforcement learning. While this approach is similar to other techniques that maintain a confidence measure for action-values, it nevertheless offers new insight into current techniques and reveals potential avenues for further research.


Year: 2004
Publisher: IDIAP

 Record created 2006-03-10, last modified 2018-01-27
