arXiv

Expanding Spatial and Temporal Context for Robotic Imitation Learning With Scene Graphs

June 2, 2026 · Jianing Qian, Qinhe Peng, Emmanuel Panov, Leonor Fermoselle, Dinesh Jayaraman, Bernadette Bucher, Tarik Kelestemur

Title: Enhancing Robotic Imitation Learning Through Scene Graphs for Broader Spatial and Temporal Awareness

Abstract: Imitation learning enables robots to acquire task skills from demonstrations. However, practical settings such as offices and homes are often severely partially observable because of their large spatial extent. Moreover, many tasks consist of sequential subtasks, so an autonomous robot must reason over long time horizons. To address these challenges, we propose integrating scene graphs into imitation learning as a structured, explicit memory. By maintaining a dynamic scene graph that records object-centric relationships and how they evolve over time, our method lets agents retain relevant historical context throughout task execution and reason efficiently over the scene information accumulated so far. Evaluations in simulated mobile manipulation and real-world tabletop manipulation tasks show that this approach significantly improves policy performance, especially in tasks that require long-horizon reasoning and strong generalization under partial observability.
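To make the idea of a scene graph used as explicit memory more concrete, the following is a minimal illustrative sketch and not the authors' implementation. The class name `SceneGraphMemory`, its fields, and the update rule (refresh relations only for objects currently in view, and keep the last known relations of everything else) are all assumptions chosen for illustration; the abstract does not specify how the graph is built or how it is fed to the policy.

```python
from dataclasses import dataclass, field


@dataclass
class SceneGraphMemory:
    """Hypothetical object-centric memory; not taken from the paper."""

    # object name -> step at which the object was last seen
    nodes: dict = field(default_factory=dict)
    # (subject, relation, object) -> step at which the relation was last observed
    edges: dict = field(default_factory=dict)
    # chronological log of (step, "added" | "removed", triple)
    history: list = field(default_factory=list)

    def update(self, step, visible, relations):
        """Merge one partial observation into the graph."""
        for name in visible:
            self.nodes[name] = step
        observed = set(relations)
        # Drop outdated relations only for subjects that are in view now;
        # objects outside the current view keep their last known relations.
        for triple in list(self.edges):
            if triple[0] in visible and triple not in observed:
                del self.edges[triple]
                self.history.append((step, "removed", triple))
        for triple in relations:
            if triple not in self.edges:
                self.history.append((step, "added", triple))
            self.edges[triple] = step

    def relations_of(self, name):
        """Return the currently believed relations whose subject is `name`."""
        return [t for t in self.edges if t[0] == name]


# Example: the robot sees the mug, looks away, then finds the mug moved.
memory = SceneGraphMemory()
memory.update(0, {"mug", "table"}, [("mug", "on", "table")])
memory.update(1, {"drawer"}, [("drawer", "is", "closed")])  # mug out of view
memory.update(2, {"mug", "shelf"}, [("mug", "on", "shelf")])
print(memory.relations_of("mug"))
print(memory.history)
```

In this sketch the drawer's state is still remembered at step 2 even though the drawer is no longer visible, which is the kind of accumulated context the abstract describes. A learned policy would presumably consume an encoding of such a graph rather than query it symbolically.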
