David Fouhey receives NSF CAREER Award for vision system to perceive the interactive world

His goal is to build AI systems that can recognize and understand a 3D and interactive world from a single image.
Prof. David Fouhey

Prof. David Fouhey has received a National Science Foundation (NSF) CAREER Award to build AI systems that can recognize and understand a 3D and interactive world from a single image. The project is titled “Learning to Perceive the Interactive 3D World from an Image.”

The CAREER Awards, among the most competitive offered by NSF, are typically given to fewer than 400 young scientists and engineers each year across all disciplines. According to the agency, they support “early-career faculty who have the potential to serve as academic role models in research and education and to lead advances in the mission of their department or organization.”

The human-built world is filled with interactive objects whose parts can be manipulated by humans. In a kitchen, for instance, you will find cabinets with drawers and doors, appliances with doors, and perhaps doors to other rooms.

For intelligent machines to understand and assist humans in such settings, they must be able to recognize from vision how these articulating objects can be used. This means understanding interactions as they occur and also recognizing opportunities for interaction, such as seeing a door and knowing that it can be opened.

These abilities are beyond the capabilities of current AI systems, which largely deal with interactive objects in restricted settings such as simulation engines. 

In this project, Fouhey aims to build AI systems that can recognize and understand a 3D and interactive world from a single image. This will advance the state of the art in perception by enabling intelligent systems to learn to perceive potential interactions with a scene. Because these interactive objects are ubiquitous and important, the work will impact fields ranging from robotics to assistive technology.

Much of the past work by Fouhey and his students has covered either 3D recognition or affordance understanding. The proposed work finally integrates these two strands of reasoning with detailed quantitative 3D models that relate the two. The past 3D work has included early efforts on estimating surface normals, volumetric 3D, and room-scale 3D from image-based cues. His lab has also pursued a number of projects related to understanding human hands in contact and human bodies in challenging Internet videos. More recently, his lab has branched out into computer vision for basic science, including work in solar physics and high-throughput bird measurements.

An expert in computer vision and machine learning, Fouhey joined CSE in January 2019. He received his PhD in Robotics from Carnegie Mellon University in 2016, after which he held positions as a postdoctoral researcher at UC Berkeley and as a visiting professor in the Willow Laboratory at INRIA Paris. He has been recognized as an outstanding reviewer at the CVPR 2018, NeurIPS 2019, ICCV 2019, and ECCV 2020 conferences. He is the director of AI4ALL at U-M, a two-week residential summer camp for high school students, and has also supervised over a dozen undergraduate researchers.