What if robots could learn by simply watching humans? A new system using smart glasses can bring that idea closer to reality.

In recent decades, robots have appeared in malls, airports, hospitals, offices, and homes. To serve as helpers, they must handle everyday tasks such as cleaning, washing dishes, cooking, and doing laundry. Training robots for these tasks with machine learning is difficult because it requires demonstration data, such as videos of humans performing the tasks. To address this, researchers at New York University and UC Berkeley developed EgoZero, a system that collects first-person videos using Project Aria smart glasses from Meta.
EgoZero uses the motion of the smart glasses to generate 3D representations of the scene and combines these with hand pose data from a hand estimation model. The result is a set of state-action pairs, which are used to train a Transformer-based policy that lets robots learn the tasks.
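To make the idea of state-action data concrete, here is a minimal Python sketch of how such pairs might be assembled. It is an illustration under stated assumptions, not EgoZero's actual code: the Frame fields, the function names, and the choice of "change in hand position between consecutive frames" as the action are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = Tuple[Vec3, Vec3, Vec3]


@dataclass
class Frame:
    """One video frame (hypothetical layout, for illustration only)."""
    rotation: Mat3             # camera-to-world rotation, assumed to come from glasses tracking
    translation: Vec3          # camera position in the world frame
    hand_cam: Vec3             # hand point in the camera frame, assumed to come from a hand pose model
    objects_world: List[Vec3]  # 3D object points, already in the world frame


def to_world(rotation: Mat3, translation: Vec3, point: Vec3) -> Vec3:
    """Map a camera-frame point into the world frame: R @ p + t."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def build_state_action_pairs(frames: List[Frame]):
    """Pair each frame's state with the hand displacement to the next frame."""
    hands = [to_world(f.rotation, f.translation, f.hand_cam) for f in frames]
    pairs = []
    for t in range(len(frames) - 1):
        # State: where the hand and the objects are right now.
        state = (hands[t], tuple(frames[t].objects_world))
        # Action (assumed definition): how the hand moves by the next frame.
        action = tuple(hands[t + 1][i] - hands[t][i] for i in range(3))
        pairs.append((state, action))
    return pairs
```

Expressing the hand in a fixed world frame, rather than the moving camera frame, is what makes head motion drop out of the recorded actions in this sketch.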
Unlike earlier systems, EgoZero does not need multiple cameras, wrist wearables, or motion capture gloves. Only smart glasses are used to collect the data.
EgoZero simplifies data collection for robot training. Twenty minutes of human demonstrations is enough for the system to collect the data needed to teach robots tasks, which makes data collection faster and more practical in natural settings.
By removing the need for teleoperation or robot-specific demonstrations, EgoZero allows robots to learn directly from human behavior, and it generates training data without the setup that previous methods required.
The Transformer-based policy trained by EgoZero operates in a closed loop, meaning it continually adjusts its actions based on what it observes. It uses 3D points captured during demonstrations, which helps the robot understand both the environment and the actions needed to complete tasks.
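The closed-loop idea can be sketched in a few lines of Python. This is a toy illustration, not the EgoZero policy: a simple hand-written rule (stand_in_policy, a hypothetical name) replaces the trained Transformer, and the scene is reduced to a single 3D target point that is re-observed at every step.

```python
from typing import Callable, Tuple

Vec3 = Tuple[float, float, float]


def distance(a: Vec3, b: Vec3) -> float:
    """Euclidean distance between two 3D points."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def stand_in_policy(hand: Vec3, target: Vec3, max_step: float = 0.05) -> Vec3:
    """Stand-in for a learned policy: step toward the target, capped at max_step."""
    delta = tuple(t - h for h, t in zip(hand, target))
    dist = distance(hand, target)
    if dist <= max_step:
        return delta
    scale = max_step / dist
    return tuple(d * scale for d in delta)


def run_closed_loop(
    hand: Vec3,
    observe_target: Callable[[int], Vec3],
    policy: Callable[[Vec3, Vec3], Vec3] = stand_in_policy,
    max_steps: int = 200,
    tol: float = 1e-3,
) -> Tuple[Vec3, int]:
    """Observe, act, repeat. Returns the final hand position and steps taken."""
    for step in range(max_steps):
        target = observe_target(step)      # fresh observation every step
        if distance(hand, target) < tol:
            return hand, step
        action = policy(hand, target)      # action depends on the current state
        hand = tuple(h + a for h, a in zip(hand, action))
    return hand, max_steps
```

Because the target is observed again on every iteration, the loop still converges if the object moves partway through, which an open-loop plan computed once at the start would not do.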
With EgoZero, researchers can train robots more efficiently, reducing time and cost. Because the system transfers human skills to robots without robot-specific data, it could also speed up robot development.
As more data becomes available, this approach could lead to robots that handle tasks in homes, offices, and public spaces. By simplifying training, EgoZero may speed up the adoption of robots as helpers.