A Bio-Inspired Learning and Control Framework for Cross-Embodiment and Cross-Task Locomotion
Animals exhibit remarkable locomotion skills despite significant sensorimotor delays and uncertain environments. Moreover, some mammals acquire these skills within minutes of birth. From the Cambrian explosion to the present day, vertebrate motor control circuits have remained remarkably similar. This shared architecture is rooted in a modular and adaptable design that supports the full complexity of locomotion. At the same time, we are living in an exciting era for robotics and artificial intelligence, in which roboticists are demonstrating impressive locomotion capabilities for legged robots. In this thesis, we present a bio-inspired, learning-based locomotion framework. We aim to use it as a tool to investigate biological hypotheses while iteratively drawing inspiration from animal motor control to enhance the learning performance of legged robots. Our architecture integrates a neural network, trained with reinforcement learning, that interacts with a central pattern generator (CPG). In the second chapter of this thesis, we introduce viability, i.e., the avoidance of falls, as the primary criterion for gait transitions in quadrupeds. We demonstrate that energy efficiency and the reduction of peak muscle force can serve as secondary criteria for gait transitions. When the robot navigates challenging discrete gaps, we observe the autonomous emergence of a trot-pronk gait transition without explicitly defining the gait. In the second part, we extend our framework to learn all possible quadrupedal gaits and their transitions within a single policy, without relying on imitation learning or motion priors. Most existing learning-based locomotion frameworks are designed for specialized policies tailored to a single morphology. In the third chapter, we present a learning framework that enables a single policy to generalize across diverse embodiments, including quadrupeds, humanoids, six-legged robots, and eight-legged robots. 
Furthermore, we show that a single policy can generate a wide range of gaits for various quadrupeds. In the fourth chapter, we demonstrate how to learn parkour skills using a unified framework without the need to train separate policies for different locomotion tasks. Additionally, we show how CPGs can help mitigate large sensorimotor delays that were not encountered during training, allowing the system to continue performing tasks. Newborn animals, such as baby goats and giraffes, exhibit an innate ability to walk almost immediately after birth, a phenomenon largely attributed to the role of CPGs. In the final part of the thesis, drawing inspiration from animal motor control architectures, we leverage CPGs and Bayesian Optimization (BO) to enable rapid learning of CPG parameters within minutes on hardware, facilitating agile jumping and locomotion.
École Polytechnique Fédérale de Lausanne
Prof. Francesco Mondada (president); Prof. Auke Ijspeert (thesis director); Prof. Josie Hughes, Prof. Marco Hutter, Prof. Ioannis Havoutis (examiners)
2025
Lausanne
2025-07-04
11385
192