Learning from Failed Demonstrations in Unreliable Systems
This paper presents a method for teaching a robot to play ping pong from failed demonstrations in a highly noisy and uncertain setting. To extract useful information from failed demonstrations, we use a MultiDonut algorithm that minimises the probability of repeating a failed demonstration and generates new attempts that are similar, but not identical, to the demonstrations. We compare human demonstrations against a random strategy and show that human demonstrations provide useful information and hence yield faster learning, especially in higher dimensions. We show that learning from observed failed attempts allows the robot to perform the task more reliably than any individual demonstrator did. We also show how this algorithm adapts to gradual deterioration in the system and increases the chances of success when interacting with an unreliable system.