Masked Whole-Body Controller for Humanoid Robots

Pranay Dugar             Aayam Shrestha             Fangzhou Yu             Bart van Marum             Alan Fern


We introduce the Masked Humanoid Controller (MHC) for whole-body tracking of target trajectories over arbitrary subsets of humanoid state variables. This enables the realization of whole-body motions from diverse sources such as video, motion capture, and VR, while ensuring balance and robustness against disturbances. The MHC is trained in simulation using a carefully designed curriculum that imitates partially masked motions from a library of behaviors spanning pre-trained policy rollouts, optimized reference trajectories, re-targeted video clips, and human motion capture data. We present simulation experiments validating the MHC's ability to execute a wide variety of behaviors from partially specified target motions. We also demonstrate sim-to-real transfer through real-world trials on the Digit humanoid robot. To our knowledge, this is the first instance of a learned controller that can realize whole-body control of a real-world humanoid for such diverse multi-modal targets.

Masked Controller

A controller that can take different combinations of inputs and execute whole-body motions on a real humanoid



Upper Body Mimicry (Lower body masked)

Whole Body Mimicry

Our Approach

  • Dataset of human motions retargeted to the humanoid
  • Target lookahead created by masking components of a sampled motion
  • Controller (MHC) trained to follow partial commands
  • Trained in simulation following a curriculum
  • Dynamics randomization applied during training for transfer to the real robot
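
The masking step in the list above can be illustrated with a minimal sketch. It is not the authors' implementation: the channel groups, their sizes, the zero-fill convention for masked entries, and the idea of returning a binary mask alongside the values are all assumptions made for illustration, as are the function names.

```python
import random
from typing import Dict, List, Sequence, Tuple

# Hypothetical command channel groups and sizes (illustrative, not from the source).
CHANNEL_GROUPS: Dict[str, int] = {
    "base_velocity": 3,
    "upper_body_joints": 8,
    "lower_body_joints": 12,
}

Frame = Dict[str, List[float]]


def mask_frame(frame: Frame, active: Sequence[str]) -> Tuple[List[float], List[float]]:
    """Flatten one target frame into (values, mask).

    Channels in groups not listed in `active` are zeroed in `values`
    and flagged with 0.0 in `mask` (assumed convention).
    """
    values: List[float] = []
    mask: List[float] = []
    for name, size in CHANNEL_GROUPS.items():
        target = frame[name]
        if len(target) != size:
            raise ValueError(f"{name}: expected {size} values, got {len(target)}")
        keep = name in active
        values.extend(target if keep else [0.0] * size)
        mask.extend([1.0 if keep else 0.0] * size)
    return values, mask


def sample_active_groups(rng: random.Random) -> List[str]:
    """Pick a random non-empty subset of channel groups to keep unmasked."""
    names = list(CHANNEL_GROUPS)
    return rng.sample(names, rng.randint(1, len(names)))


def masked_lookahead(
    motion: Sequence[Frame], start: int, horizon: int, active: Sequence[str]
) -> List[Tuple[List[float], List[float]]]:
    """Build masked targets for `horizon` frames from `start`, clamping at the last frame."""
    last = len(motion) - 1
    return [mask_frame(motion[min(start + i, last)], active) for i in range(horizon)]
```

Calling `masked_lookahead(motion, t, horizon, ["upper_body_joints"])` would correspond to the upper-body mimicry case, where the lower-body targets are masked and the controller is left free to choose them.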

Training Curriculum


MHC in Simulation

A wide range of motions that the MHC replicates in simulation using a single masked controller


Turn in-place

Wide Turn

Zombie walk

Defend & Jab

Lead Hook

Upper Cut

Rear Upper Cut


Hand Wave

Golf Drive


Reallusion Shield Walk

Reallusion Taunt

Reallusion Sword swing

Reallusion Shield swing

Box Pick-Up

Upper Body + Locomotion

(The controller has no perception capabilities and blindly follows the given motion trajectories, without any knowledge of the box in its hands.)


        title={Learning Multi-Modal Whole-Body Control for Real-World Humanoid Robots}, 
        author={Pranay Dugar and Aayam Shrestha and Fangzhou Yu and Bart van Marum and Alan Fern},