A Leap in Time: Learning and Reasoning with Videos

Date(s):

Location:
Jacobs School of Engineering, 9500 Gilman Dr, La Jolla, San Diego, California 92093

Speaker(s):
Xiaolong Wang

Abstract:

The field of computer vision has been transformed by the success of deep Convolutional Neural Networks (ConvNets). State-of-the-art deep models have led to major advances in visual tasks such as object detection and segmentation. One key ingredient behind this success is a large amount of human supervision for training ConvNets. However, can we really annotate every task we want to solve? As computer vision moves toward more difficult and structured AI tasks, it becomes more challenging for humans to provide training supervision.
 
In this talk, I will argue that we need to go beyond images and exploit the spatial-temporal structure in videos. In videos, millions of pixels are linked to each other over time. I will discuss how to learn this visual correspondence from continuous observations in videos without any human supervision. Once the correspondence is obtained, it can be used as supervision for training ConvNets, eliminating the need for manual labels. Going beyond visual recognition, the spatial-temporal structure in videos also provides supervision signals for learning visual interactions. I will talk about our recent efforts on learning scene affordance by passively watching human interactions in videos, and on learning visual navigation by actively interacting with the environment.


Speaker Bio:
Xiaolong Wang is a final-year Ph.D. student at the Robotics Institute at Carnegie Mellon University. He will join UC San Diego as an assistant professor in the ECE department in fall 2020. His research interests focus on computer vision, specifically self-supervised learning, video understanding, and learning common sense and interaction. He has collaborated with research labs including Berkeley AI Research, Facebook AI Research, and the Allen Institute for Artificial Intelligence. He is a recipient of the Facebook Fellowship, the Nvidia Fellowship, and the Baidu Fellowship.

Contact:
Beatriz Valenzuela bpvalenz@ucsd.edu