Abstract

We introduce a class of learning problems in which the agent is presented with a series of tasks. Intuitively, if the tasks are related, then information gained while performing one task has value for performing another. Consequently, the agent is intrinsically motivated to explore its environment beyond the degree necessary to solve the task currently at hand. We develop a decision-theoretic setting that generalises standard reinforcement learning tasks and captures this intuition. More precisely, we consider a multi-stage stochastic game between a learning agent and an opponent. We posit that this setting is a good model for the problem of lifelong learning in uncertain environments: while resources must be spent learning about currently important tasks, effort must also be allocated towards learning about aspects of the world that are not relevant at the moment, since unpredictable future events may change the decision maker's priorities. Thus, in some sense, the model “explains” the necessity of curiosity. Apart from introducing the general formalism, the paper provides algorithms, which are evaluated experimentally in some exemplary domains. In addition, performance bounds are proven for some cases of this problem.
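The abstract describes the setting only at a high level. As a rough illustration (our own sketch under assumed details, not the paper's formalism or algorithms), the following toy loop shows the structure it suggests: a sequence of stages, an opponent choosing which task currently matters, and an agent whose exploration of the shared environment dynamics pays off across tasks. The tabular environment, the random opponent, and all names here are illustrative assumptions.

# Hypothetical sketch of the multi-stage setting: at each stage an
# "opponent" picks which task (reward function) matters, while the agent
# reuses dynamics knowledge gathered on earlier tasks. Everything below
# is an illustrative assumption, not the paper's model.
import random
from collections import defaultdict

N_STATES, N_ACTIONS, N_TASKS = 5, 2, 3
random.seed(0)

# Transition kernel (unknown to the agent) and one reward table per task.
P = {(s, a): random.randrange(N_STATES)
     for s in range(N_STATES) for a in range(N_ACTIONS)}
R = {t: {s: random.random() for s in range(N_STATES)}
     for t in range(N_TASKS)}

# The agent's shared model of the dynamics, accumulated across all stages.
model = {}                 # (state, action) -> observed next state
visits = defaultdict(int)  # (state, action) -> visit count

def act(state, epsilon=0.3):
    """Prefer actions never tried in this state with probability epsilon,
    so dynamics knowledge accumulates even when the current task does not
    require it (the 'curiosity' the abstract refers to)."""
    untried = [a for a in range(N_ACTIONS) if (state, a) not in model]
    if untried and random.random() < epsilon:
        return random.choice(untried)
    return random.randrange(N_ACTIONS)

for stage in range(10):
    task = random.randrange(N_TASKS)  # placeholder for the opponent's choice
    state, ret = 0, 0.0
    for step in range(20):
        a = act(state)
        nxt = P[(state, a)]
        model[(state, a)] = nxt       # this information transfers to later tasks
        visits[(state, a)] += 1
        ret += R[task][nxt]
        state = nxt
    print(f"stage {stage}: task {task}, return {ret:.2f}, "
          f"model coverage {len(model)}/{N_STATES * N_ACTIONS}")

In the paper's game-theoretic setting the opponent is adversarial rather than random; the random task draw above merely stands in for the stage structure, and the coverage printout shows how exploration on one task leaves the agent better informed for the next.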
