Representation Learning

Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control

We investigate representations from pre-trained text-to-image diffusion models for control and demonstrate competitive performance across a wide range of tasks.

What Do We Learn from a Large-Scale Study of Pre-Trained Visual Representations in Sim and Real Environments?

We conduct a large-scale study of pre-trained visual representations (PVRs) for training robots on tasks in both simulated and real-world environments.

Where are we in the search for an Artificial Visual Cortex for Embodied Intelligence?

We present the largest and most comprehensive empirical study of visual foundation models for Embodied AI (EAI).

OVRL: Offline Visual Representation Learning for Embodied Navigation

In this work, we propose OVRL, a two-stage representation learning strategy for visual navigation tasks in Embodied AI.