Generative Models for Learning Robot Manipulation Skills from Humans

A long standing goal in artificial intelligence is to make robots seamlessly interact with humans in performing everyday manipulation skills. Learning from demonstrations or imitation learning provides a promising route to bridge this gap. In contrast to direct trajectory learning from demonstrations, many problems arise in interactive robotic applications that require higher contextual level understanding of the environment. This requires learning invariant mappings in the demonstrations that can generalize across different environmental situations such as size, position, orientation of objects, viewpoint of the observer, etc. In this thesis, we address this challenge by encapsulating invariant patterns in the demonstrations using probabilistic learning models for acquiring dexterous manipulation skills. We learn the joint probability density function of the demonstrations with a hidden semi-Markov model, and smoothly follow the generated sequence of states with a linear quadratic tracking controller. The model exploits the invariant segments (also termed as sub-goals, options or actions) in the demonstrations and adapts the movement in accordance with the external environmental situations such as size, position and orientation of the objects in the environment using a task-parameterized formulation. We incorporate high-dimensional sensory data for skill acquisition by parsimoniously representing the demonstrations using statistical subspace clustering methods and exploit the coordination patterns in latent space. To adapt the models on the fly and/or teach new manipulation skills online with the streaming data, we formulate a non-parametric scalable online sequence clustering algorithm with Bayesian non-parametric mixture models to avoid the model selection problem while ensuring tractability under small variance asymptotics. We exploit the developed generative models to perform manipulation skills with remotely operated vehicles over satellite communication in the presence of communication delays and limited bandwidth. A set of task-parameterized generative models are learned from the demonstrations of different manipulation skills provided by the teleoperator. The model captures the intention of teleoperator on one hand and provides assistance in performing remote manipulation tasks on the other hand under varying environmental situations. The assistance is formulated under time-independent shared control, where the model continuously corrects the remote arm movement based on the current state of the teleoperator; and/or time-dependent autonomous control, where the model synthesizes the movement of the remote arm for autonomous skill execution. Using the proposed methodology with the two-armed Baxter robot as a mock-up for semi-autonomous teleoperation, we are able to learn manipulation skills such as opening a valve, pick-and-place an object by obstacle avoidance, hot-stabbing (a specialized underwater task akin to peg-in-a-hole task), screw-driver target snapping, and tracking a carabiner in as few as 4 - 8 demonstrations. Our study shows that the proposed manipulation assistance formulations improve the performance of the teleoperator by reducing the task errors and the execution time, while catering for the environmental differences in performing remote manipulation tasks with limited bandwidth and communication delays.


Advisor(s):
Bourlard, Hervé
Calinon, Sylvain
Year:
2018
Publisher:
Lausanne, EPFL
Keywords:
Laboratories:




 Record created 2018-01-25, last modified 2019-05-09

Fulltext:
Download fulltext
PDF

Rate this document:

Rate this document:
1
2
3
 
(Not yet reviewed)