ManipTrans
Efficient Dexterous Bimanual Manipulation Transfer via Residual Learning

CVPR'25
1 Beijing Institute for General Artificial Intelligence (BIGAI)
2 Department of Automation, Tsinghua University
3 Institute for Artificial Intelligence, Peking University

Abstract

Human hands play a central role in interacting with the world, motivating increasing research in dexterous robotic manipulation. Data-driven embodied AI algorithms demand precise, large-scale, human-like manipulation sequences, which are challenging to obtain with conventional reinforcement learning or real-world teleoperation. To address this, we introduce ManipTrans, a novel two-stage method for efficiently transferring human bimanual skills to dexterous robotic hands in simulation. ManipTrans first pre-trains a generalist trajectory imitator to mimic hand motion, then fine-tunes a specific residual module under interaction constraints, enabling efficient learning and accurate execution of complex bimanual tasks. Experiments show that ManipTrans surpasses state-of-the-art methods in success rate, fidelity, and efficiency. Leveraging ManipTrans, we transfer multiple hand-object datasets to robotic hands, creating DexManipNet, a large-scale dataset featuring previously unexplored tasks such as pen capping and bottle unscrewing. DexManipNet comprises 3.3K episodes of robotic manipulation and is easily extensible, facilitating further policy training for dexterous hands and enabling real-world deployments.

Method

Our goal is to learn a policy that enables dexterous robotic hands to accurately replicate given human hand–object interaction trajectories in simulation, while satisfying the task's semantic manipulation constraints. The key insight of ManipTrans is to approach this transfer as a two-stage process: first, a pre-training trajectory imitation stage focusing solely on hand motion, and second, a specific action fine-tuning stage that addresses interaction constraints. In the initial stage, we develop a robust generalist model, pre-trained on large-scale human demonstrations, which learns to mimic human hand motions accurately. Building upon this, we then employ a residual learning module to refine the robot's actions incrementally. This refinement concentrates on two critical aspects: (1) ensuring stable contact with object surfaces under physical constraints to enable effective object manipulation, and (2) coordinating both hands to achieve precise, high-fidelity execution of complex bimanual operations.
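The two-stage design described above can be sketched in code. The sketch below is illustrative only: the class names, observation/action dimensions, zero-initialization of the residual head, and the additive `scale` factor are our assumptions for exposition, not the paper's actual architecture or training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)


class BaseImitator:
    """Stage 1: frozen generalist trajectory imitator.

    Maps an observation (e.g. reference hand keypoints and robot
    proprioception) to dexterous-hand joint targets. A fixed random
    linear map stands in for the pre-trained network here.
    """

    def __init__(self, obs_dim: int, act_dim: int):
        self.W = rng.standard_normal((act_dim, obs_dim)) * 0.1

    def act(self, obs: np.ndarray) -> np.ndarray:
        return np.tanh(self.W @ obs)


class ResidualModule:
    """Stage 2: task-specific residual policy.

    Outputs a small correction added to the base action, so the
    fine-tuned policy can satisfy contact and bimanual-coordination
    constraints. Zero-initialized, so training starts exactly from
    the imitator's behavior.
    """

    def __init__(self, obs_dim: int, act_dim: int, scale: float = 0.1):
        self.W = np.zeros((act_dim, obs_dim))
        self.scale = scale

    def correction(self, obs: np.ndarray) -> np.ndarray:
        return self.scale * np.tanh(self.W @ obs)


def policy_action(base: BaseImitator,
                  residual: ResidualModule,
                  obs: np.ndarray) -> np.ndarray:
    # Final joint command = base imitation + learned residual correction.
    return base.act(obs) + residual.correction(obs)
```

Because the residual head starts at zero, the combined policy initially reproduces the imitator's motion exactly, and fine-tuning only has to learn the small corrections needed for stable object contact.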

Simulation Results

Single hand manipulation results.
Bimanual manipulation results.

Real-world Results

Mounting via a custom-designed 90° flange.
Mounting the dexterous hand directly.

Citation

@inproceedings{li2025maniptrans,
  title={ManipTrans: Efficient Dexterous Bimanual Manipulation Transfer via Residual Learning},
  author={Li, Kailin and Li, Puhao and Liu, Tengyu and Li, Yuyang and Huang, Siyuan},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2025}
}