A Dynamical System-based Approach to Modeling Stable Robot Control Policies via Imitation Learning

Despite tremendous advances in robotics, we are still amazed by the proficiency with which humans perform movements. Even new waves of robotic systems still rely heavily on hardcoded motions with a limited ability to react autonomously and robustly to a dynamically changing environment. This thesis focuses on providing possible mechanisms to push the level of adaptivity, reactivity, and robustness of robotic systems closer to human movements. Specifically, it aims at developing these mechanisms for a subclass of robot motions called “reaching movements”, i.e. movements in space stopping at a given target (also referred to as episodic motions, discrete motions, or point-to-point motions). These reaching movements can then be used as building blocks to form more advanced robot tasks. To achieve a high level of proficiency as described above, this thesis particularly seeks to derive control policies that: 1) resemble human motions, 2) guarantee the accomplishment of the task (if the target is reachable), and 3) can instantly adapt to changes in dynamic environments. To avoid manually hardcoding robot motions, this thesis exploits the power of machine learning techniques and takes an Imitation Learning (IL) approach to build a generic model of robot movements from a few examples provided by an expert. To achieve the required level of robustness and reactivity, the perspective adopted in this thesis is that a reaching movement can be described with a nonlinear Dynamical System (DS). When building an estimate of a DS from demonstrations, there are two key problems that need to be addressed: the problem of generating motions that resemble the demonstrations as closely as possible (the “how-to-imitate” problem), and most importantly, the problem of ensuring the accomplishment of the task, i.e. reaching the target (the “stability” problem).
Although there are numerous well-established approaches in robotics that could answer each of these problems separately, tackling both problems simultaneously is challenging and has not been extensively studied yet. This thesis first tackles the problem mentioned above by introducing an iterative method to build an estimate of an autonomous nonlinear DS formulated as a mixture of Gaussian functions. This method minimizes the number of Gaussian functions required for achieving both local asymptotic stability at the target and accuracy in following demonstrations. We then extend this formulation and provide sufficient conditions to ensure global asymptotic stability of autonomous DS at the target. In this approach, an estimate of the underlying DS is built by solving a constrained optimization problem, where the metric of accuracy and the stability conditions are formulated as the optimization objective and constraints, respectively. In addition to ensuring convergence of all motions to the target within the local or global stability regions, these approaches offer an inherent adaptability and robustness to changes in dynamic environments. This thesis further extends the previous approaches and ensures global asymptotic stability of DS-based motions at the target independently of the choice of the regression technique. Therefore, it offers the possibility to choose the most appropriate regression technique based on the requirements of the task at hand without compromising DS stability. This approach also provides the possibility of online learning and using a combination of two or more regression methods to model more advanced robot tasks, and can be applied to estimate motions that are represented with both autonomous and non-autonomous DS.
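To make the link between stability and adaptability concrete, the following minimal sketch models a reaching movement as an autonomous DS with a single linear component, x_dot = A (x - x*), rather than the mixture of Gaussian functions used in the thesis. The stability check is the standard Lyapunov argument with V(x) = ||x - x*||^2, which decreases along every trajectory when the symmetric part of A is negative definite; the abstract does not spell out the thesis's own conditions, so this check, the matrix values, and all function names are assumptions made for illustration only.

```python
import math

def velocity(x, target, A):
    """Autonomous DS: x_dot = A (x - target), in 2D."""
    e = (x[0] - target[0], x[1] - target[1])
    return (A[0][0] * e[0] + A[0][1] * e[1],
            A[1][0] * e[0] + A[1][1] * e[1])

def is_stable(A):
    """Sufficient condition (assumed here, not quoted from the thesis):
    the symmetric part of the 2x2 matrix A is negative definite, so
    V(x) = ||x - target||^2 strictly decreases away from the target."""
    s00 = A[0][0]
    s01 = 0.5 * (A[0][1] + A[1][0])
    s11 = A[1][1]
    return s00 < 0 and s00 * s11 - s01 * s01 > 0

def rollout(x0, target_fn, A, dt=0.01, steps=2000):
    """Euler-integrate the DS; target_fn(k) returns the target at step k,
    so the target may move while the motion is being executed."""
    x = tuple(x0)
    for k in range(steps):
        v = velocity(x, target_fn(k), A)
        x = (x[0] + dt * v[0], x[1] + dt * v[1])
    return x

# Illustrative values: a curved (rotating) but contracting vector field.
A = [[-2.0, 1.5], [-1.5, -1.0]]

# The target jumps after 500 steps; no replanning is needed because the
# velocity is recomputed from the current state at every step.
def moving_target(k):
    return (1.0, 1.0) if k < 500 else (-0.5, 2.0)

final = rollout((3.0, -2.0), moving_target, A)
print(is_stable(A), math.hypot(final[0] + 0.5, final[1] - 2.0))
```

Because the policy is a function of the current state only, a displaced target or a perturbed robot simply changes the input to the same vector field, which is the sense in which adaptation comes with the representation.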
Additionally, this thesis suggests a reformulation of robot motion modeling that allows encoding of a considerably wider set of tasks ranging from reaching movements to agile robot movements that require hitting a given target with a specific speed and direction. This approach is validated in the challenging task of playing minigolf. Finally, the last part of this thesis proposes a DS-based approach to real-time obstacle avoidance. The presented approach provides a modulation that instantly modifies the robot’s motion to avoid collision with multiple static and moving convex obstacles. This approach can be applied to all the techniques described above without affecting their adaptability, swiftness, or robustness. The techniques that are developed in this thesis have been validated in simulation and on different robotic platforms including the humanoid robots HOAP-3 and iCub, and the robot arms KATANA, WAM, and LWR. Throughout this thesis we show that the DS-based approach to modeling robot discrete movements can offer a high level of adaptability, reactivity, and robustness almost effortlessly when interacting with dynamic environments.
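The idea of modulating a DS to avoid an obstacle can be sketched as follows. The abstract does not give the modulation formula, so this example assumes one simple form for a single static circular obstacle in 2D: the nominal velocity is split into components normal and tangent to the obstacle, the normal component is scaled by 1 - 1/Γ and the tangential one by 1 + 1/Γ, where Γ(x) = (||x - c|| / r)^2 equals 1 on the obstacle surface. The nominal DS, the scaling factors, and all names are illustrative assumptions, not the thesis's definitive method, and moving or multiple obstacles are not covered.

```python
import math

def nominal_velocity(x, target, gain=1.0):
    """Nominal reaching DS (assumed linear): x_dot = gain * (target - x)."""
    return (gain * (target[0] - x[0]), gain * (target[1] - x[1]))

def modulated_velocity(x, target, center, radius):
    """Reshape the nominal velocity around a circular obstacle.
    Valid for points outside the obstacle (distance to center > 0)."""
    f = nominal_velocity(x, target)
    dx, dy = x[0] - center[0], x[1] - center[1]
    dist = math.hypot(dx, dy)
    gamma = (dist / radius) ** 2          # 1 on the surface, > 1 outside
    n = (dx / dist, dy / dist)            # outward unit normal
    t = (-n[1], n[0])                     # unit tangent
    f_n = f[0] * n[0] + f[1] * n[1]       # normal component of f
    f_t = f[0] * t[0] + f[1] * t[1]       # tangential component of f
    lam_n = 1.0 - 1.0 / gamma             # vanishes on the surface
    lam_t = 1.0 + 1.0 / gamma             # speeds up sliding around it
    return (lam_n * f_n * n[0] + lam_t * f_t * t[0],
            lam_n * f_n * n[1] + lam_t * f_t * t[1])

def rollout(x0, target, center, radius, dt=0.01, steps=3000):
    """Euler-integrate the modulated DS; return the final state and the
    smallest distance to the obstacle center seen along the path."""
    x = tuple(x0)
    min_dist = math.hypot(x[0] - center[0], x[1] - center[1])
    for _ in range(steps):
        v = modulated_velocity(x, target, center, radius)
        x = (x[0] + dt * v[0], x[1] + dt * v[1])
        min_dist = min(min_dist, math.hypot(x[0] - center[0], x[1] - center[1]))
    return x, min_dist

# Obstacle sits on the straight line between start and target.
final, closest = rollout((-3.0, 0.1), (3.0, 0.0), (0.0, 0.0), 1.0)
print(final, closest)
```

Far from the obstacle Γ is large and both factors approach 1, so the nominal motion is left nearly unchanged; on the surface the normal component vanishes, so the flow cannot enter the obstacle.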

Billard, Aude
Lausanne, EPFL
Other identifiers:
urn: urn:nbn:ch:bel-epfl-thesis5552-3


 Record created 2012-12-05, last modified 2020-04-20
