Learning search behaviour from humans
A common way to account for the partial observability of the environments in which robots operate is to formulate the problem as a Partially Observable Markov Decision Process (POMDP). By having humans demonstrate how to act in such a partially observable setting, we can leverage their prior knowledge, experience and intuition, which are difficult to encode directly in a controller, to solve a task formulated as a POMDP. In this work we learn search behaviours from human demonstrators and transfer this knowledge to a robot in a setting where no visual information is available. The task consists of finding a block on a table. This is a non-trivial problem: because no visual information is available, the demonstrator's belief about their state (position in the environment) has to be inferred. We show that by representing the human's belief about their position in the environment with a particle filter (PF), and learning a mapping from this belief to their end-effector velocities with a Gaussian Mixture Model (GMM), we can model the human's search process. To validate that the search process has been modelled successfully, we compare the types of search behaviour demonstrated by the humans with those produced by our learned model. We then contrast the performance of this human-inspired search model with that of a greedy controller and show that, like the humans, the learned controller minimises uncertainty and is therefore more robust in the face of false beliefs.