On the Role of Touch for Robustness and Generalisability in Robotic Manipulation

Speaker Name: Jeannette Bohg
Speaker Title: Assistant Professor of Computer Science
Speaker Organization: Stanford University
Start Time: Thursday, April 29, 2021 - 2:00pm
End Time: Thursday, April 29, 2021 - 3:00pm
Ricardo Sanfelice



Learning contact-rich robotic manipulation skills is a challenging problem due to the high dimensionality of the state and action spaces as well as uncertainty from noisy sensors and inaccurate motor control. In our research, we explore which representations of raw perceptual data enable a robot to better learn and perform these skills. For manipulation robots specifically, the sense of touch is essential, yet it is non-trivial to manually design a robot controller that combines sensing modalities with very different characteristics. I will present a set of our research works that explore how best to fuse information from vision and touch for contact-rich manipulation tasks. While deep reinforcement learning has shown success in learning control policies for high-dimensional inputs, these algorithms are generally intractable to deploy on real robots due to their sample complexity. We use self-supervision to learn a compact, multimodal representation of visual and haptic sensory inputs, which can then be used to improve the sample efficiency of policy learning. I present experiments on a peg insertion task where the learned policy generalises over different geometries, configurations, and clearances, while being robust to external perturbations. Although this work has shown very promising results on fusing vision and touch into a learned latent representation, that representation is not interpretable. In follow-up work, we present a multimodal fusion algorithm that exploits a differentiable filtering framework for tracking the state of manipulated objects, thereby facilitating longer-horizon planning. We also propose a framework in which a robot can exploit information from failed manipulation attempts to recover and retry. Finally, we show how exploiting multiple modalities helps to compensate for corrupted sensory data in one of the modalities.
I will conclude this talk with a discussion of appropriate representations for multimodal sensory data.



Jeannette Bohg is an Assistant Professor of Computer Science at Stanford University. She was a group leader at the Autonomous Motion Department (AMD) of the MPI for Intelligent Systems until September 2017. Before joining AMD in January 2012, Jeannette Bohg was a PhD student at the Division of Robotics, Perception and Learning (RPL) at KTH in Stockholm. In her thesis, she proposed novel methods towards multi-modal scene understanding for robotic grasping. She also studied at Chalmers in Gothenburg and at the Technical University in Dresden where she received her Master in Art and Technology and her Diploma in Computer Science, respectively. Her research focuses on perception and learning for autonomous robotic manipulation and grasping. She is specifically interested in developing methods that are goal-directed, real-time and multi-modal such that they can provide meaningful feedback for execution and learning. Jeannette Bohg has received several awards, most notably the 2019 IEEE International Conference on Robotics and Automation (ICRA) Best Paper Award, the 2019 IEEE Robotics and Automation Society Early Career Award and the 2017 IEEE Robotics and Automation Letters (RA-L) Best Paper Award.