Masked Whole-Body Controller for Humanoid Robots

Pranay Dugar, Aayam Shrestha, Fangzhou Yu, Bart van Marum, Alan Fern






Abstract

We introduce the Masked Humanoid Controller (MHC) for whole-body tracking of target trajectories over arbitrary subsets of humanoid state variables. This enables the realization of whole-body motions from diverse sources such as video, motion capture, and VR, while ensuring balance and robustness against disturbances. The MHC is trained in simulation using a carefully designed curriculum that imitates partially masked motions from a library of behaviors spanning pre-trained policy rollouts, optimized reference trajectories, re-targeted video clips, and human motion capture data. We present simulation experiments validating the MHC's ability to execute a wide variety of behaviors from partially specified target motions. We also demonstrate sim-to-real transfer through real-world trials on the Digit humanoid robot. To our knowledge, this is the first instance of a learned controller that can realize whole-body control of a real-world humanoid for such diverse multi-modal targets.

Masked Controller

A controller that can take different combinations of inputs and execute whole-body motions on a real humanoid

Locomotion

Perturbation

Upper Body Mimicry (Lower body masked)

Whole Body Mimicry

Our Approach

  • Dataset of human motions retargeted to the humanoid
  • Target lookahead created by masking components of a sampled motion
  • Controller (MHC) trained to follow partial commands
  • Trained in simulation following a curriculum
  • Dynamics randomization applied during training for transfer to the real robot
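
The masking step in the list above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the command groups, their sizes, the `keep_prob` parameter, and the choice to zero out masked targets and append one indicator flag per group are all assumptions made for the example.

```python
import random

# Hypothetical command groups and sizes; the actual MHC command layout
# is not specified on this page.
COMMAND_GROUPS = {
    "base_velocity": 3,
    "upper_body_joints": 8,
    "lower_body_joints": 12,
}


def sample_mask(rng, keep_prob=0.5):
    """Decide per command group whether it stays visible (True) or is masked."""
    return {name: rng.random() < keep_prob for name in COMMAND_GROUPS}


def mask_lookahead(lookahead, mask):
    """Build the partial command seen by the controller.

    `lookahead` is a list of frames; each frame maps a group name to a
    list of target values. Masked groups are replaced by zeros, and one
    0/1 flag per group is appended so the policy can tell a masked
    target apart from a genuine zero target (an assumed encoding).
    """
    masked = []
    for frame in lookahead:
        values, flags = [], []
        for name, size in COMMAND_GROUPS.items():
            target = frame[name]
            if len(target) != size:
                raise ValueError(f"{name}: expected {size} values")
            keep = mask[name]
            values.extend(target if keep else [0.0] * size)
            flags.append(1.0 if keep else 0.0)
        masked.append(values + flags)
    return masked


# Usage: an upper-body-plus-locomotion command with the lower body masked.
frame = {name: [1.0] * size for name, size in COMMAND_GROUPS.items()}
lookahead = [frame] * 4
mask = {"base_velocity": True, "upper_body_joints": True, "lower_body_joints": False}
command = mask_lookahead(lookahead, mask)
```

During training, a mask like the one returned by `sample_mask` would be drawn per episode so that a single controller sees many combinations of specified and unspecified targets.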

Training Curriculum


MHC in Simulation

A wide range of motions that the MHC replicates in simulation using a single masked controller

Walk

Turn in-place

Wide Turn

Zombie walk


Defend & Jab

Lead Hook

Upper Cut

Rear Upper Cut


Squat

Hand Wave

Golf Drive

Dance


Reallusion Shield Walk

Reallusion Taunt

Reallusion Sword swing

Reallusion Shield swing



Box Pick-Up

Upper Body + Locomotion

(The controller has no perception capabilities and blindly follows the given motion trajectories, without any knowledge of the box in its hands.)


BibTeX


@misc{dugar2024mhc,
  title={Learning Multi-Modal Whole-Body Control for Real-World Humanoid Robots},
  author={Pranay Dugar and Aayam Shrestha and Fangzhou Yu and Bart van Marum and Alan Fern},
  year={2024},
  eprint={2408.07295},
  archivePrefix={arXiv},
  primaryClass={cs.RO},
  url={https://arxiv.org/abs/2408.07295},
}