We investigate representations from pre-trained text-to-image diffusion models for control and demonstrate competitive performance across a wide range of tasks.
We conduct an empirical study on using pre-trained visual representations (PVRs) to train robots for real-world tasks.
We present the largest and most comprehensive empirical study of visual foundation models for Embodied AI (EAI).
We propose a combined simulation and real-world benchmark for Open-Vocabulary Mobile Manipulation (OVMM).