
Twisting Lids Off with Two Hands

Toru Lin*   Zhao-Heng Yin*   Haozhi Qi   Pieter Abbeel   Jitendra Malik

UC Berkeley

TL;DR: We train two robot hands to twist bottle lids using deep RL and sim-to-real transfer.

Overview

Manipulating objects with two multi-fingered hands has been a long-standing challenge in robotics, attributed to the contact-rich nature of many manipulation tasks and the complexity inherent in coordinating a high-dimensional bimanual system. In this work, we consider the problem of twisting lids of various bottle-like objects with two hands, and demonstrate that policies trained in simulation using deep reinforcement learning can be effectively transferred to the real world. With novel engineering insights into physical modeling, real-time perception, and reward design, the policy demonstrates generalization capabilities across a diverse set of unseen objects, showcasing dynamic and dexterous behaviors. Our findings serve as compelling evidence that deep reinforcement learning combined with sim-to-real transfer remains a promising approach for addressing manipulation problems of unprecedented complexity.
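As a concrete illustration of the sim-to-real ingredient mentioned above, the sketch below shows the kind of per-episode domain randomization that such transfer typically relies on: physical properties of the simulated bottle are re-sampled for every training episode so the policy must succeed across a family of objects rather than a single one. The parameter names and ranges are our own illustrative assumptions, not values from the paper.

```python
import numpy as np

def sample_object_physics(rng: np.random.Generator) -> dict:
    """Draw one randomized set of object parameters for a training episode.

    All names and ranges here are illustrative assumptions, not the paper's
    actual randomization configuration.
    """
    return {
        "lid_radius_m": rng.uniform(0.02, 0.05),
        "lid_friction": rng.uniform(0.5, 1.5),
        "bottle_mass_kg": rng.uniform(0.05, 0.5),
        "thread_damping": rng.uniform(0.01, 0.10),
    }

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for episode in range(3):
        physics = sample_object_physics(rng)
        # In a full pipeline, a simulated bottle with these properties would be
        # instantiated here and rolled out under the current RL policy.
        print(f"episode {episode}: {physics}")
```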

Robustness

During policy deployment, we perturb objects at random times by poking or pushing them along random directions using a picker tool or a hand. Our policy is robust against these random external forces, and can adapt quickly to sustain continuous manipulation. Both videos show how our policy can reorient and translate a perturbed object back to a stable in-hand pose.

Emergent Behavior

Our policy exhibits interesting emergent behaviors that maintain its robustness when deployed on objects that are significantly different from those in the training distribution. We observe that our policy can skillfully adjust the finger gaits and grasps of both hands to recover objects from unstable states back to stable poses. Our policy also adapts its movements to objects of different shapes and sizes.

Perception

We use 3D object keypoints extracted from RGBD images as the object representation. Specifically, we generate two separate segmentation masks for the bottle body and lid on the first frame, and track these masks throughout all remaining frames. […]
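A minimal sketch of how such a keypoint representation could be computed from an aligned depth map and the two tracked masks is given below. The camera-intrinsics interface, the fixed keypoint count, and the random subsampling are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

def backproject(depth, mask, fx, fy, cx, cy):
    """Lift masked depth pixels into a 3D point cloud in the camera frame."""
    v, u = np.nonzero(mask)          # pixel coordinates covered by the mask
    z = depth[v, u]
    valid = z > 0                    # discard missing depth readings
    u, v, z = u[valid], v[valid], z[valid]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1)

def object_keypoints(depth, body_mask, lid_mask, intrinsics, n_keypoints=16, rng=None):
    """Subsample a fixed number of 3D keypoints for the bottle body and the lid.

    `intrinsics` is assumed to be (fx, fy, cx, cy); the keypoint count and the
    random subsampling scheme are illustrative choices, not the paper's.
    """
    rng = rng or np.random.default_rng(0)
    keypoints = []
    for mask in (body_mask, lid_mask):
        points = backproject(depth, mask, *intrinsics)
        idx = rng.choice(len(points), size=n_keypoints,
                         replace=len(points) < n_keypoints)
        keypoints.append(points[idx])
    return np.concatenate(keypoints, axis=0)   # shape: (2 * n_keypoints, 3)
```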



BibTeX

Acknowledgements

We thank Chen Wang and Yuzhe Qin for helpful discussions on hardware setup and simulation of the Allegro Hand. TL is supported by fellowships from the National Science Foundation and UC Berkeley. ZY is supported by funding from InnoHK Centre for Logistics Robotics and ONR MURI N00014-22-1-2773. HQ is supported by the DARPA Machine Common Sense and ONR MURI N00014-21-1-2801.

Website template edited from TILTED
