RL + Model-based Control: Using On-demand Optimal Control to Learn Versatile Legged Locomotion

IEEE Robotics and Automation Letters (RA-L)


This letter presents a control framework that combines model-based optimal control and reinforcement learning (RL) to achieve versatile and robust legged locomotion. Our approach enhances the RL training process by incorporating on-demand reference motions generated through finite-horizon optimal control, covering a broad range of velocities and gaits. These reference motions serve as targets for the RL policy to imitate, leading to robust control policies that can be learned reliably. Furthermore, by utilizing realistic simulation data that captures whole-body dynamics, RL effectively overcomes the inherent limitations in reference motions imposed by modeling simplifications. We validate the robustness and controllability of the RL training process within our framework through a series of experiments. In these experiments, thanks to RL’s flexibility, our method generalizes reference motions and handles more complex locomotion tasks that may pose challenges for the simplified model. Additionally, our framework readily supports the training of control policies for robots with diverse dimensions, eliminating the need for robot-specific adjustments in the reward function and hyperparameters.
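
To make the idea of "reference motions as imitation targets" concrete, here is a minimal sketch of what a tracking reward could look like. It is an illustration only: the exponential reward shape, the function name `imitation_reward`, and the `sigma` parameter are assumptions for this example and are not taken from the letter, which defines its own reward terms.

```python
import math


def imitation_reward(state, reference, sigma=0.5):
    """Hypothetical tracking reward in (0, 1].

    Returns 1.0 when `state` matches `reference` exactly and decays
    toward 0 as the squared tracking error grows. `sigma` (assumed)
    controls how quickly the reward falls off.
    """
    if len(state) != len(reference):
        raise ValueError("state and reference must have the same length")
    # Squared Euclidean distance between the measured and reference quantities.
    sq_err = sum((s - r) ** 2 for s, r in zip(state, reference))
    return math.exp(-sq_err / sigma ** 2)


# Hypothetical joint targets, standing in for one sample of a reference
# motion produced on demand by the optimal-control planner.
reference = [0.0, 0.8, -1.6]
measured = [0.05, 0.75, -1.5]

print(imitation_reward(reference, reference))  # 1.0 (perfect tracking)
print(imitation_reward(measured, reference))   # strictly between 0 and 1
```

In a setup like this, the reference would be regenerated for each commanded velocity and gait, and the RL policy would be rewarded for staying close to it while training in a whole-body simulation.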

Paper: [IEEE Xplore] | Open access: [arXiv]

Supplementary Video


We are pleased to announce that our paper has been selected for presentation at ICRA 2024 in Yokohama, Japan. Please join us for our oral presentation in the "Legged Robot III" session on May 14th from 16:30 to 18:00, and for our poster presentation session earlier that day from 10:30 to 12:00. We are excited to share our work with you and look forward to seeing you there.



  author={Kang, Dongho and Cheng, Jin and Zamora, Miguel and Zargarbashi, Fatemeh and Coros, Stelian},
  journal={IEEE Robotics and Automation Letters}, 
  title={RL + Model-Based Control: Using On-Demand Optimal Control to Learn Versatile Legged Locomotion}, 


This work has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 866480).

We express our gratitude to Zijun Hui for his assistance with the robot experiments.