Fri 10/23

LMSS @ Cornell Tech: Yonatan Bisk (Carnegie Mellon University)

ALFRED — A Simulated Playground for Connecting Language, Action, and Perception

Vision-and-Language Navigation has become a popular task in the grounding literature, but the real world includes interaction, state changes, and long-horizon planning. (Actually, the real world requires motors and torques, but let’s ignore that for the moment.) We present ALFRED (Action Learning From Realistic Environments and Directives) as a benchmark dataset with the goal of facilitating more complex embodied language understanding. In this talk, I’ll discuss the benchmark itself and subsequent pieces of work enabled by the environment and annotations. Our goal is to provide a playground for moving embodied language+vision research closer to robotics, enabling the community to work on uncovering abstractions and interactions between planning, reasoning, and action taking.

Speaker Bio

Yonatan Bisk is an Assistant Professor at Carnegie Mellon University. Yonatan’s research area is Natural Language Processing (NLP) with a focus on grounding. In particular, his work broadly falls into three areas: (1) uncovering the latent structures of natural language, (2) modeling the semantics of the physical world, and (3) connecting language to perception and control.