Budgeted Knowledge Transfer for State-wise Heterogeneous RL Agents
In this paper we introduce a budgeted knowledge transfer algorithm for non-homogeneous reinforcement learning agents. Here the source and the target agents are completely identical except in their state representations. The algorithm uses functional space (Q-value space) as the transfer-learning media. In this method, the target agent’s functional points (Q-values) are estimated in an automatically selected lower-dimension subspace in order to accelerate knowledge transfer. The target agent searches that subspace using an exploration policy and selects actions accordingly during the period of its knowledge transfer in order to facilitate gaining an appropriate estimate of its Q-table. We show both analytically and empirically that this method decreases the required learning budget for the target agent.